PostgreSQL: Disaster Recovery Plan

ROCKEN Documentation

This document outlines the disaster recovery setup for the PostgreSQL cluster deployed as a managed database on DigitalOcean. The configuration ensures high availability, data integrity, and quick recovery in the event of an incident.

Common Failure Scenarios and Recovery Steps

1: Primary Node Failure

Impact

The primary node becomes unavailable, affecting write operations.

Recovery Steps

Promote an existing read-only node to become the primary node of the database cluster.
```
doctl databases replica promote <database-cluster-id> <replica-name> [flags]
```
Update the DNS record in the prod.rockengroup.com domain with the new connection string if the primary node changes due to failover.
Provision a new read-only node to maintain redundancy.
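
The provisioning step can be sketched with doctl as well. This is a minimal example, not the team's confirmed procedure: the function names are made up for illustration, and the region and size slugs in the usage comment are placeholder assumptions that must be replaced with the values of the actual cluster.

```shell
#!/usr/bin/env bash
# Sketch: add a replacement read-only node after a promotion.
# Function names are illustrative; region/size values are assumptions.

provision_replica() {
  # $1 = database cluster ID, $2 = new replica name,
  # $3 = region slug, $4 = size slug
  doctl databases replica create "$1" "$2" --region "$3" --size "$4"
}

list_replicas() {
  # $1 = database cluster ID
  # Lists the cluster's replicas so the new node's status can be checked.
  doctl databases replica list "$1"
}

# Example (placeholder values):
#   provision_replica <database-cluster-id> <replica-name> fra1 db-s-2vcpu-4gb
#   list_replicas <database-cluster-id>
```

The replica should normally match the size and region of the node it replaces, so that it can take over as primary in a later failover.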

2: Cluster Failure

Impact

Both the primary and read-only nodes are unavailable, resulting in a complete loss of database access.

Recovery Steps

Initiate the restoration process through the DigitalOcean interface, following the PostgreSQL: Digital Ocean restore DB document.
Ensure the restored databases are consistent and free of corruption.
Update the connection string in the prod.rockengroup.com domain to point to the newly restored cluster.
Set up a new read-only node for redundancy.
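
The connection string update could look roughly like the sketch below. It assumes that the prod.rockengroup.com zone is hosted on DigitalOcean DNS, which this document does not confirm; if the zone lives elsewhere, only the first function applies. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: repoint a DNS record at the restored cluster.
# Assumption: the domain is managed in DigitalOcean DNS.

show_cluster_connection() {
  # $1 = database cluster ID
  # Prints the connection details of the cluster, including its host name.
  doctl databases connection "$1"
}

list_dns_records() {
  # $1 = domain
  # Lists the domain's records to find the ID of the record to change.
  doctl compute domain records list "$1"
}

update_dns_record() {
  # $1 = domain, $2 = record ID, $3 = new target host
  doctl compute domain records update "$1" --record-id "$2" --record-data "$3"
}

# Example (placeholder values):
#   show_cluster_connection <database-cluster-id>
#   list_dns_records prod.rockengroup.com
#   update_dns_record prod.rockengroup.com <record-id> <new-cluster-host>
```

Applications keep resolving the old host until the record's TTL expires, so it is worth confirming the new target with a DNS lookup before declaring the recovery complete.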

3: Data Failure

Impact

Data corruption or accidental deletion affects database integrity.

Recovery Steps

Use Snapshooter (PostgreSQL: Snapshooter restore DB) to restore the database to a point in time before the data failure occurred:
- Go to the backup-snapshooter-prod Space and find the required DB archive.
- Download the MANUAL_RESTORE.txt file and follow its instructions to restore the DB.
Check the restored data for consistency and completeness.
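
One possible way to run the consistency check is sketched below using the standard PostgreSQL client tools. It is a basic sanity check under stated assumptions, not a full verification: the function names are illustrative, the connection string and table name are supplied by the operator, and a successful run shows only that the table data is readable, not that it is logically correct or that indexes are intact.

```shell
#!/usr/bin/env bash
# Sketch: basic readability and completeness checks on a restored database.
# Function names are illustrative; pass the restored DB's connection string.

full_read_check() {
  # $1 = connection string of the restored database
  # pg_dump reads the contents of every table, so data that cannot be
  # read causes it to exit with an error. The dump itself is discarded.
  pg_dump --dbname="$1" > /dev/null
}

row_count() {
  # $1 = connection string, $2 = table name (trusted input only,
  # since it is placed directly into the SQL statement)
  psql "$1" --no-psqlrc --tuples-only --no-align \
    --command "SELECT count(*) FROM $2;"
}

# Example (placeholder values):
#   full_read_check "<restored-db-connection-string>" && echo "read check passed"
#   row_count "<restored-db-connection-string>" <table-name>
```

Comparing the row counts of a few key tables against the values expected for the chosen restore point gives a quick indication of whether the restore is complete.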
