Category: ROCKEN Documentation

  • Retro Sprint 77

    General info

    Initiated by: Andrii Kupriianov (PM)

    Reason: Finished sprint

    Date: 10.09.2024

    Members: Andrii Kupriianov, Ivan Hodoniuk, Dasha Rozhniatovska, Anton Poliakov, Olexandr Tikan, Yurii Tymchuk, Viktoriia Malysh, Roman Kliuiko

    Link to the board: Retro Sprint 77 – https://ideaboardz.com/for/Rocken%20Sprint%2077/5376255

    Board columns: Liked / Lacked / Learned / Longed for

    • Set up the new production environment

    • Closed a lot of scope

    • High-level review of Rocken products

    • Collaboration between the CTO and the team

    • Design and implementation of the Profile preview PDF

    • Good demo

    • Started migrations

    • Connected the new production environment to Rocken jobs

    • Time for a demo

    • New Sentry features

    • Statuses and captions in Confluence

    • Switched off Telescope for speed

    • Analyzed competitors

    • Learned more about DigitalOcean services

    • Take part in the demo and demonstrate features

    • Delete “story 1.2.3“ from ticket names

    What needs to be improved?

    • Use clearer naming for user stories, without numbers
    • Set a time limit for the demo
  • Elasticsearch

    Elasticsearch Cluster Deployment on DigitalOcean

    Elasticsearch is deployed on three droplets hosted on DigitalOcean.
    The Elasticsearch cluster consists of several nodes that communicate with each other over transport port 9300, using the internal network 10.10.1.0/24.
    Request load balancing is handled by a DigitalOcean Internal Load Balancer, accessible via the domain name elasticsearch-lb.prod.rockengroup.com on port 9200.
    Port forwarding is also configured on the load balancer:

    HTTPS on port 9200 -> HTTP on port 9200

    The same port (9200) is also used for health checks of the droplets.
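
    A quick way to verify the load balancer, forwarding, and authentication is a cluster-health request through it (a minimal sketch; the password for the elastic user is stored in 1Password):

      # Check cluster health through the internal load balancer
      curl -u "elastic:<password>" \
        "https://elasticsearch-lb.prod.rockengroup.com:9200/_cluster/health?pretty"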

    List of Elasticsearch Droplets:

    Droplet Name             Domain                                   IP Address
    elasticsearch-01-prod    elasticsearch-01.prod.rockengroup.com    167.172.97.179
    elasticsearch-02-prod    elasticsearch-02.prod.rockengroup.com    178.128.207.86
    elasticsearch-03-prod    elasticsearch-03.prod.rockengroup.com    46.101.168.210

    Access to the cluster is restricted with a login and password (password can be found in 1Password).
    The cluster is deployed using Docker Compose files (GitLab URL).

    Directory Structure and Description:


    • certs – folder with certificates used for internal communication between cluster nodes

    • docker-compose.yml – Docker Compose file with configuration

    • .env – file with variables used in the Docker Compose file

    The connection between nodes is established using SSL.
    Adding new nodes is possible by modifying discovery.seed_hosts and creating a new Docker Compose file for each new node.
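
    Inter-node SSL uses the elastic-certificates.p12 keystore from the certs folder. For reference, a minimal sketch of how such a keystore is typically generated with Elasticsearch's certutil tool (the exact commands used here are not recorded in this document):

      # Generate a CA, then a node certificate/key pair signed by it
      bin/elasticsearch-certutil ca --out elastic-stack-ca.p12
      bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 --out certs/elastic-certificates.p12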


    Example of the Docker Compose file for elasticsearch-01-prod:

    version: "3.8"
    
    services:
      elasticsearch-node:
        image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
        container_name: elasticsearch-docker-node-1
        volumes:
          - elastic-data1:/usr/share/elasticsearch/data
          - elastic-config:/usr/share/elasticsearch/config
        ports:
          - ${ES_PORT}:9200
          - ${ES_PORT_TRANSFER}:9300
        environment:
          - node.name=elasticsearch-01
          - cluster.name=elasticsearch-cluster
          - network.host=0.0.0.0
          - network.publish_host=10.10.1.26
          - xpack.security.enabled=true
          - xpack.security.transport.ssl.enabled=true
          - xpack.security.transport.ssl.verification_mode=certificate
          - xpack.security.transport.ssl.keystore.path=/usr/share/elasticsearch/config/certs/elastic-certificates.p12
          - xpack.security.transport.ssl.keystore.password=${CERT_STORE_PASSWORD}
          - xpack.security.transport.ssl.truststore.path=/usr/share/elasticsearch/config/certs/elastic-certificates.p12
          - xpack.security.transport.ssl.truststore.password=${CERT_STORE_PASSWORD}
          - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
          - bootstrap.memory_lock=true
          - discovery.seed_hosts=10.10.1.27:9300,10.10.1.28:9300
          # - cluster.initial_master_nodes=elasticsearch-01,elasticsearch-02,elasticsearch-03
        ulimits:
          memlock:
            soft: -1
            hard: -1
        networks:
          elastic_network:
            ipv4_address: 172.20.1.11
    
    networks:
      elastic_network:
        driver: bridge
        ipam:
          config:
            - subnet: 172.20.1.0/24
    
    volumes:
      elastic-data1:
        driver: local
      elastic-config:
        driver: local

    To add a new node to an existing cluster, simply copy the docker-compose.yml, the certs folder, and the .env file from one of the existing nodes, then update node.name, the IP addresses, and other variables as needed. The cluster will automatically sync all indices and settings to the new node; see the sketch below.
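
    A minimal sketch of the changes for a hypothetical fourth node, elasticsearch-04-prod (the node name and the 10.10.1.29 address are assumptions for illustration):

      # After copying docker-compose.yml, certs/ and .env from an existing node, edit:
      #   node.name=elasticsearch-04
      #   network.publish_host=10.10.1.29   # the new droplet's internal IP (assumed)
      #   discovery.seed_hosts=10.10.1.26:9300,10.10.1.27:9300,10.10.1.28:9300
      docker compose up -d   # start the node; it joins the cluster and syncs automatically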


    Key Parameters Explained:

    • image – docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} – The image used to start the container. The ${STACK_VERSION} variable is stored in the .env file.

    • volumes – Local storage for persistent data and configuration:

      elastic-data1:/usr/share/elasticsearch/data
      elastic-config:/usr/share/elasticsearch/config

      The certs folder with the self-signed certificates must be located inside the elastic-config volume.

    • xpack.security.enabled – Enables X-Pack Security to support SSL and password authentication in Elasticsearch. Other options such as xpack.security.transport.ssl.enabled, xpack.security.transport.ssl.verification_mode, etc., configure additional security settings.

    • ELASTIC_PASSWORD – Password for accessing Elasticsearch.

    • bootstrap.memory_lock=true – Prevents Elasticsearch from swapping memory to disk.

    • discovery.seed_hosts – List of IP addresses for other cluster nodes.

    • cluster.initial_master_nodes – Used to initialize the cluster during its first deployment.

    • ulimits – Removes the memlock limit so Elasticsearch can lock its memory (required for bootstrap.memory_lock=true)

    • networks – Creates a network using the bridge driver.

    • volumes (top level) – Declares the named local volumes for Elasticsearch data and configuration


    Elasticsearch and mmapfs:

    Elasticsearch uses the mmapfs directory by default to store its indices. The operating system's default limit on memory-map areas (vm.max_map_count) may be too low, so the Elasticsearch documentation recommends increasing it to 262144.
    To make the change permanent, add vm.max_map_count=262144 to the /etc/sysctl.conf file and run sysctl -p to apply it.
    You can also change the value temporarily with sysctl -w vm.max_map_count=262144, but this lasts only until the next system reboot.
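
    The same steps as commands:

      # Permanent: persist the setting and reload sysctl configuration
      echo "vm.max_map_count=262144" >> /etc/sysctl.conf
      sysctl -p

      # Temporary: applies immediately, reverts on reboot
      sysctl -w vm.max_map_count=262144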


    Checking Cluster Status:

    The cluster status can be viewed via the following URL:

    https://elasticsearch-01.prod.rockengroup.com:9200/_cat/nodes
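
    Since the cluster requires authentication, the equivalent curl request (password in 1Password); the elected master is marked with * in the master column:

      # List the cluster nodes and their roles
      curl -u "elastic:<password>" \
        "https://elasticsearch-01.prod.rockengroup.com:9200/_cat/nodes?v"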

    Elasticsearch Cluster Setup

    • Master Node Elections: Elasticsearch uses a master node election process to manage cluster state. With 3 master-eligible nodes, a majority quorum (more than half) can still be formed after a single failure, so the cluster keeps functioning even if one master-eligible node goes down. A quick way to check the current master is sketched after this list.

    • If 2 of the 3 nodes go offline, the cluster loses its quorum and becomes unavailable until a majority can be re-established.

    • Once a second node comes back online, Elasticsearch will begin restoring the cluster state.
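
    A minimal sketch for checking which node currently holds the master role (password in 1Password):

      # Show the elected master node (id, host, ip, node name)
      curl -u "elastic:<password>" \
        "https://elasticsearch-01.prod.rockengroup.com:9200/_cat/master?v"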

    Elasticsearch Backup Strategy

    Snapshot Repository Setup

    • Repository Type: S3-compatible (DigitalOcean Spaces)

    • Repository Name: backup-elasticsearch-prod

    • Configured Location: DigitalOcean Spaces bucket, with a specific base path (es-snapshots).

    • Access: Credentials (access key and secret key) are stored in 1Password under DO Space: backup-elasticsearch-prod. A hedged registration sketch follows this list.
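
    A hedged sketch of how an S3-compatible repository like this one is registered (the endpoint region below is an assumption for illustration; the access and secret keys are normally stored in the Elasticsearch keystore, not passed in the request):

      # Register a DigitalOcean Spaces bucket as an s3 snapshot repository
      curl -u "elastic:<password>" -X PUT "http://localhost:9200/_snapshot/backup-elasticsearch-prod" \
        -H 'Content-Type: application/json' -d'
      {
        "type": "s3",
        "settings": {
          "bucket": "backup-elasticsearch-prod",
          "base_path": "es-snapshots",
          "endpoint": "fra1.digitaloceanspaces.com"
        }
      }'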

    Automation for Backups

    Scripts for backups are configured on the elasticsearch-01-prod node.

    • Backup Script:

      #!/bin/bash
      
      DATE=$(date +%Y%m%d%H%M%S)
      ES_DIR="/root/elasticsearch"
      ES_HOST="http://localhost:9200"
      REPO_NAME="backup-elasticsearch-prod"
      SNAPSHOT_NAME="snapshot_$DATE"
      ES_PASSWORD=$(grep "^ELASTIC_PASSWORD=" $ES_DIR/.env | cut -d '=' -f2)
      LOG_NAME="$ES_DIR/es_snapshot.log"
      
      # create snapshot
      echo "$(date)" | tee -a $LOG_NAME
      echo "Creating a snapshot with name $SNAPSHOT_NAME" | tee -a $LOG_NAME
      
      # Perform snapshot request
      time curl -u "elastic:$ES_PASSWORD" -X PUT "${ES_HOST}/_snapshot/${REPO_NAME}/${SNAPSHOT_NAME}?wait_for_completion=true&pretty" \
          -H 'Content-Type: application/json' -d'
      {
        "indices": "*",
        "ignore_unavailable": true,
        "include_global_state": true
      }
      ' | tee -a $LOG_NAME
      
      sleep 5
      # check if snapshot is available
      if curl -u "elastic:$ES_PASSWORD" -X GET "${ES_HOST}/_cat/snapshots/${REPO_NAME}?v&s=id&pretty" | grep "$SNAPSHOT_NAME"
      then
          echo "$(date) - Snapshot ${SNAPSHOT_NAME} created" | tee -a $LOG_NAME
      else
          echo "$(date) - !!! Snapshot was not found. Exit !!!" | tee -a $LOG_NAME
          exit 1
      fi
      
      echo "" >> $LOG_NAME
    • Execution Schedule:

      The script is scheduled to run automatically every night at 1:00 AM UTC using a cron job.

      0 1 * * * /root/elasticsearch/es_backup.sh
    • Verification

      Snapshots can be verified using the following command:

      curl -u "user:password" -X GET "http://localhost:9200/_cat/snapshots/backup-elasticsearch-prod?v&s=id&pretty"

    Logs from the backup script are stored in /root/elasticsearch/es_snapshot.log.
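
    Restoring from a snapshot uses the same repository. A minimal sketch (the snapshot name is a placeholder; indices being restored must not already exist as open indices):

      # Restore all indices from a given snapshot (name is a placeholder)
      curl -u "elastic:$ES_PASSWORD" -X POST \
        "http://localhost:9200/_snapshot/backup-elasticsearch-prod/snapshot_20240909010000/_restore?pretty" \
        -H 'Content-Type: application/json' -d'
      {
        "indices": "*",
        "include_global_state": true
      }'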

    Retention Policy

    To manage storage effectively and prevent the repository from filling up with excessive snapshots, a retention policy is implemented to keep only the last 7 backups. Older backups are automatically deleted.

    • Cleanup Script

      #!/bin/bash
      
      ES_DIR="/root/elasticsearch"
      ES_HOST="http://localhost:9200"
      REPO_NAME="backup-elasticsearch-prod"
      ES_PASSWORD=$(grep "^ELASTIC_PASSWORD=" $ES_DIR/.env | cut -d '=' -f2)
      LOG_NAME="$ES_DIR/es_snapshot_cleanup.log"
      
      KEEP_COUNT=7  # Number of snapshots to keep
      
      echo "$(date)" | tee -a $LOG_NAME
      echo "Retrieving snapshot list..." | tee -a $LOG_NAME
      SNAPSHOT_LIST=$(curl -u "elastic:$ES_PASSWORD" -X GET "${ES_HOST}/_cat/snapshots/${REPO_NAME}?h=id&s=start_epoch&pretty" | awk '{print $1}')
      TOTAL_SNAPSHOTS=$(echo "$SNAPSHOT_LIST" | tr ' ' '\n' | wc -l)
      
      # Check if cleanup is needed
      if [ "$TOTAL_SNAPSHOTS" -gt "$KEEP_COUNT" ]; then
        echo "Cleaning up old snapshots. Total: $TOTAL_SNAPSHOTS, Keeping: $KEEP_COUNT" | tee -a $LOG_NAME
        DELETE_COUNT=$((TOTAL_SNAPSHOTS - KEEP_COUNT))
        DELETE_SNAPSHOTS=$(echo "$SNAPSHOT_LIST" | tr ' ' '\n' | head -n "$DELETE_COUNT")
      
        for SNAPSHOT in $DELETE_SNAPSHOTS; do
          echo "Deleting snapshot: $SNAPSHOT" | tee -a $LOG_NAME
          # Delete via the _snapshot API (the _cat endpoints are read-only)
          curl -u "elastic:$ES_PASSWORD" -X DELETE "${ES_HOST}/_snapshot/${REPO_NAME}/$SNAPSHOT?pretty" | tee -a $LOG_NAME
        done 
      else
        echo "No cleanup needed. Total snapshots: $TOTAL_SNAPSHOTS" | tee -a $LOG_NAME
      fi
      
      echo "" >> $LOG_NAME
    • Execution Schedule:

      The script is scheduled to run automatically every night at 2:00 AM UTC using a cron job.

      0 2 * * * /root/elasticsearch/es_backup_cleanup.sh
    • Verification

      Snapshots can be verified using the following command:

      curl -u "user:password" -X GET "http://localhost:9200/_cat/snapshots/backup-elasticsearch-prod?v&s=id&pretty"

      Verify that only the last 7 snapshots are retained.

    • Logs from the cleanup script are stored in /root/elasticsearch/es_snapshot_cleanup.log.