Cloud Snapshot Backups#

Barman can perform backups of Postgres servers deployed in specific cloud environments by utilizing snapshots of storage volumes. In this setup, Postgres file backups are represented as volume snapshots stored in the cloud, while Barman functions as the storage server for Write-Ahead Logs (WALs) and the backup catalog. Despite the backup data being stored in the cloud, Barman manages these backups similarly to traditional ones created with rsync or postgres backup methods.

Note

Additionally, snapshot backups can be created without a Barman server by using the barman-cloud-backup command directly on the Postgres server. Refer to the barman cloud client package section for more information on how to properly work with this option.

Important

The following configuration options and equivalent command arguments (if applicable) are not available when using backup_method=snapshot:

  • backup_compression

  • bandwidth_limit (--bwlimit)

  • parallel_jobs (--jobs)

  • network_compression

  • reuse_backup (--reuse-backup)

To configure a backup using snapshot, include the following parameters in the Barman server configuration file:

backup_method = snapshot
snapshot_provider = CLOUD_PROVIDER
snapshot_instance = INSTANCE_NAME
snapshot_disks = DISK_NAME1,DISK_NAME2

Important

Ensure snapshot_disks includes all disks that store Postgres data. Any data stored on a disk not listed will not be backed up and will be unavailable during recovery.

Requirements and Configuration#

To use the snapshot backup method with Barman, your deployment must meet these requirements:

  1. Postgres must be running on a compute instance provided by a supported cloud provider.

  2. All critical data, including PGDATA and tablespace data, must be stored on storage volumes that support snapshots.

  3. The findmnt command must be available on the Postgres host.

Important

Configuration files stored outside of PGDATA will not be included in the snapshots. You will need to manage these files separately, using a configuration management system or other mechanisms.

Google Cloud Platform#

To use snapshot backups on GCP with Barman, please ensure the following:

  1. Python Libraries

Install the google-cloud-compute and grpcio libraries for the Python distribution used by Barman. These libraries are optional and not included by default.

Install them using pip:

pip3 install grpcio google-cloud-compute

Note

The google-cloud-compute library requires Python 3.7 or newer. GCP snapshots are not compatible with earlier Python versions.

  1. Disk Requirements

The disks used in the snapshot backup must be zonal persistent disks. Regional persistent disks are not supported at this time.

  1. Access Control

Barman needs a service account with specific permissions. You can either attach this account to the compute instance running Barman (recommended) or use the GOOGLE_APPLICATION_CREDENTIALS environment variable to specify a credentials file.

Important

Ensure the service account has the permissions listed below:

  • compute.disks.createSnapshot

  • compute.disks.get

  • compute.globalOperations.get

  • compute.instances.get

  • compute.snapshots.create

  • compute.snapshots.delete

  • compute.snapshots.list

For provider specific credentials configurations, refer to the Google authentication methods and service account impersonation.

  1. Specific Configuration

The fields gcp_project and gcp_zone are configuration options specific to GCP.

gcp_project = GCP_PROJECT_ID
gcp_zone = ZONE

Microsoft Azure#

To use snapshot backups on Azure with Barman, ensure the following:

  1. Python Libraries

The azure-mgmt-compute and azure-identity libraries must be available for the Python distribution used by Barman. These libraries are optional and not included by default.

Install them using pip:

pip3 install azure-mgmt-compute azure-identity

Note

The azure-mgmt-compute library requires Python 3.7 or later. Azure snapshots are not compatible with earlier Python versions.

  1. Disk Requirements

All disks involved in the snapshot backup must be managed disks attached to the VM instance as data disks.

  1. Access Control

Barman needs to access Azure using credentials obtained via managed identity or CLI login.

The following environment variables are supported: AZURE_STORAGE_CONNECTION_STRING, AZURE_STORAGE_KEY and AZURE_STORAGE_SAS_TOKEN. You can also use the --credential option to specify either default, azure-cli or managed-identity credentials in order to authenticate via Azure Active Directory.

Important

Ensure the credential has the permissions listed below:

  • Microsoft.Compute/disks/read

  • Microsoft.Compute/virtualMachines/read

  • Microsoft.Compute/snapshots/read

  • Microsoft.Compute/snapshots/write

  • Microsoft.Compute/snapshots/delete

For provider specific credential configurations, refer to the Azure environment variables configurations, Identity Package and DefaultAzureCredential documentation.

  1. Specific Configuration

The fields azure_subscription_id and azure_resource_group are configuration options specific to Azure.

azure_subscription_id = AZURE_SUBSCRIPTION_ID
azure_resource_group = AZURE_RESOURCE_GROUP

Amazon Web Services#

To use snapshot backups on AWS with Barman, please ensure the following:

  1. Python Libraries

The boto3 library must be available for the Python distribution used by Barman. This library is optional and not included by default.

Install it using pip:

pip3 install boto3
  1. Disk Requirements

All disks involved in the snapshot backup must be non-root EBS volumes attached to the same VM instance.

  1. Access Control

Barman needs to access AWS so you must configure the AWS credentials with the awscli tool as the postgres user, by entering the Access Key and Secret Key that must be previously created in the IAM section of the AWS console.

Important

Ensure you have the permissions listed below:

  • ec2:CreateSnapshot

  • ec2:CreateTags

  • ec2:DeleteSnapshot

  • ec2:DescribeSnapshots

  • ec2:DescribeInstances

  • ec2:DescribeVolumes

For provider specific credentials configurations, refer to the AWS boto3 configurations.

  1. Specific Configuration

The fields aws_region, aws_profile and aws_await_snapshots_timeout are configuration options specific to AWS.

aws_profile is the name of the AWS profile in the credentials file. If not used, the default profile will be applied. If no credentials file exists, credentials will come from the environment.

aws_region overrides any region defined in the AWS profile.

aws_await_snapshots_timeout is the timeout for waiting for snapshots to be created (default is 3600 seconds).

When specifying snapshot_instance or snapshot_disks, Barman accepts either the instance/volume ID or the name of the resource. If you use a name, Barman will query AWS for resources with a matching Name tag. If zero or multiple matches are found, Barman will return an error.

aws_region = AWS_REGION
aws_profile = AWS_PROFILE_NAME
aws_await_snapshots_timeout = TIMEOUT_IN_SECONDS
  1. Ransomware Protection

Ransomware protection is essential to secure data and maintain operational stability. With Amazon EBS Snapshot Lock, snapshots are protected from deletion, providing an immutable backup that safeguards against ransomware attacks. By locking snapshots, unwanted deletions are prevented, ensuring reliable recovery options in case of compromise. Barman can prevent unwanted deletion of backups by locking the snapshots when creating the backup.

Note

To delete a locked backup, you must first manually remove the lock in the AWS console.

To lock a snapshot during backup creation, you need to configure the following options:

  1. Choose the snapshot lock mode: either compliance or governance.

  2. Set either the lock duration or the expiration date (not both). Lock duration is specified in days, ranging from 1 to 36,500. If you choose an expiration date, it must be at least 1 day after the snapshot creation date and time, using the format YYYY-MM-DDTHH:MM:SS.sssZ.

  3. Optionally, set a cool-off period (in hours), from 1 to 72. This option only applies when the lock mode is set to compliance.

aws_snapshot_lock_mode = compliance | governance
aws_snapshot_lock_duration = 1
aws_snapshot_lock_cool_off_period = 1
aws_snapshot_lock_expiration_date = "2024-10-07T21:53:00.606Z"

Important

Ensure you have the permission listed below:

  • ec2:LockSnapshot

For the concepts behind AWS Snapshot Lock, refer to the Amazon EBS snapshot lock concepts.

Backup Process#

Here is an overview of the snapshot backup process:

  1. Barman performs checks to validate the snapshot options, instance, and disks.

    Before each backup and during the barman check command, the following checks are performed:

    • The compute instance specified by snapshot_instance and any provider-specific arguments exists.

    • The disks listed in snapshot_disks are present.

    • The disks listed in snapshot_disks are attached to the snapshot_instance.

    • The disks listed in snapshot_disks are mounted on the snapshot_instance.

  2. Barman initiates the backup using the Postgres backup API.

  3. The cloud provider API is used to create a snapshot for each specified disk. Barman waits until each snapshot reaches a state that guarantees application consistency before proceeding to the next disk.

  4. Additional provider-specific details, such as the device name for each disk, and the mount point and options for each disk are recorded in the backup metadata.

Metadata#

Regardless of whether you provision recovery disks and instances using infrastructure-as-code, ad-hoc automation, or manually, you will need to use Barman to identify the necessary snapshots for a specific backup. You can do this with the barman show-backup command, which provides details for each snapshot included in the backup.

For example:

Backup 20240813T200506:
  Server Name            : snapshot
  System Id              : 7402620047885836080
  Status                 : DONE
  PostgreSQL Version     : 160004
  PGDATA directory       : /opt/postgres/data
  Estimated Cluster Size : 22.7 MiB

  Server information:
    Checksums            : on

  Snapshot information:
    provider             : aws
    account_id           : 714574844897
    region               : sa-east-1

    device_name          : /dev/sdf
    snapshot_id          : snap-0d2288b4f30e3f9e3
    snapshot_name        : Barman_AWS:1:/dev/sdf-20240813t200506
    Mount point          : /opt/postgres
    Mount options        : rw,noatime,seclabel

  Base backup information:
    Backup Method        : snapshot-concurrent
    Backup Size          : 1.0 KiB (16.0 MiB with WALs)
    WAL Size             : 16.0 MiB
    Timeline             : 1
    Begin WAL            : 00000001000000000000001A
    End WAL              : 00000001000000000000001A
    Number of WALs       : 1
    Begin time           : 2024-08-14 16:21:50.820618+00:00
    End time             : 2024-08-14 16:22:38.264726+00:00
    Copy time            : 47 seconds
    Estimated throughput : 22 B/s
    Begin Offset         : 40
    End Offset           : 312
    Begin LSN            : 0/1A000028
    End LSN              : 0/1A000138

  WAL information:
    Number of files      : 1
    Disk usage           : 16.0 MiB
    WAL rate             : 5048.32/hour
    Last available       : 00000001000000000000001B

  Catalog information:
    Retention Policy     : not enforced
    Previous Backup      : - (this is the oldest base backup)
    Next Backup          : - (this is the latest base backup)

The --format=json option can be used when integrating with external tooling.

{
  "snapshots_info": {
    "provider": "gcp",
    "provider_info": {
      "project": "project_id"
    },
    "snapshots": [
      {
        "mount": {
          "mount_options": "rw,noatime",
          "mount_point": "/opt/postgres"
        },
        "provider": {
          "device_name": "pgdata",
          "snapshot_name": "barman-av-ubuntu20-primary-pgdata-20230123t131430",
          "snapshot_project": "project_id"
        }
      },
      {
        "mount": {
          "mount_options": "rw,noatime",
          "mount_point": "/opt/postgres/tablespaces/tbs1"
        },
        "provider": {
          "device_name": "tbs1",
          "snapshot_name": "barman-av-ubuntu20-primary-tbs1-20230123t131430",
          "snapshot_project": "project_id",
        }
      }
    ]
  }
}

The metadata found in snapshots_info/provider_info and snapshots_info/snapshots/*/provider varies depending on the cloud provider, as detailed in the following sections.

GCP

snapshots_info/provider_info

  • project: The GCP project ID of the project which owns the resources involved in backup and recovery.

snapshots_info/snapshots/*/provider

  • device_name: The short device name with which the source disk for the snapshot was attached to the backup VM at the time of the backup.

  • snapshot_name: The name of the snapshot.

  • snapshot_project: The GCP project ID which owns the snapshot.

Azure

snapshots_info/provider_info

  • subscription_id: The Azure subscription ID which owns the resources involved in backup and recovery.

  • resource_group: The Azure resource group to which the resources involved in the backup belong.

snapshots_info/snapshots/*/provider

  • location: The Azure location of the disk from which the snapshot was taken.

  • lun: The LUN identifying the disk from which the snapshot was taken at the time of the backup.

  • snapshot_name: The name of the snapshot.

AWS

snapshots_info/provider_info

  • account_id: The ID of the AWS account which owns the resources used to make the backup.

  • region: The AWS region in which the resources involved in backup are located.

snapshots_info/snapshots/*/provider

  • device_name: The device to which the source disk was mapped on the backup VM at the time of the backup.

  • snapshot_id: The ID of the snapshot as assigned by AWS.

  • snapshot_name: The name of the snapshot.