.. _cloud-snapshot-backups:

Cloud Snapshot Backups
======================

Barman can perform backups of Postgres servers deployed in specific cloud
environments by utilizing snapshots of storage volumes. In this setup, Postgres
file backups are represented as volume snapshots stored in the cloud, while
Barman functions as the storage server for Write-Ahead Logs (WALs) and the
backup catalog. Despite the backup data being stored in the cloud, Barman
manages these backups similarly to traditional ones created with ``rsync`` or
``postgres`` backup methods.

.. note::
    Snapshot backups can also be created without a Barman server by running the
    ``barman-cloud-backup`` command directly on the Postgres server. Refer to
    the :ref:`barman cloud client package ` section for details on how to work
    with this option.

.. important::
    The following configuration options and equivalent command arguments (if
    applicable) are not available when using ``backup_method=snapshot``:

    * ``backup_compression``
    * ``bandwidth_limit`` (``--bwlimit``)
    * ``parallel_jobs`` (``--jobs``)
    * ``network_compression``
    * ``reuse_backup`` (``--reuse-backup``)

To configure a snapshot backup, include the following parameters in the Barman
server configuration file:

.. code-block:: text

    backup_method = snapshot
    snapshot_provider = CLOUD_PROVIDER
    snapshot_instance = INSTANCE_NAME
    snapshot_disks = DISK_NAME1,DISK_NAME2

.. important::
    Ensure ``snapshot_disks`` includes all disks that store Postgres data. Any
    data stored on a disk not listed will not be backed up and will be
    unavailable during recovery.

Requirements and Configuration
------------------------------

To use the snapshot backup method with Barman, your deployment must meet these
requirements:

1. Postgres must be running on a compute instance provided by a supported cloud
   provider.
2. All critical data, including PGDATA and tablespace data, must be stored on
   storage volumes that support snapshots.
3. The ``findmnt`` command must be available on the Postgres host.

.. important::
    Configuration files stored outside of ``PGDATA`` will not be included in
    the snapshots. You will need to manage these files separately, using a
    configuration management system or other mechanisms.

Google Cloud Platform
---------------------

To use snapshot backups on :term:`GCP` with Barman, ensure the following:

1. **Python Libraries**

   Install the ``google-cloud-compute`` and ``grpcio`` libraries for the Python
   distribution used by Barman. These libraries are optional and not included
   by default.

   Install them using pip:

   .. code:: bash

       pip3 install grpcio google-cloud-compute

   .. note::
       The ``google-cloud-compute`` library requires Python 3.7 or newer. GCP
       snapshots are not compatible with earlier Python versions.

2. **Disk Requirements**

   The disks used in the ``snapshot`` backup must be zonal persistent disks.
   Regional persistent disks are not supported at this time.

3. **Access Control**

   Barman needs a service account with specific permissions. You can either
   attach this account to the compute instance running Barman (recommended) or
   use the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to specify a
   credentials file.

   .. important::
       Ensure the service account has the permissions listed below:

       * ``compute.disks.createSnapshot``
       * ``compute.disks.get``
       * ``compute.globalOperations.get``
       * ``compute.instances.get``
       * ``compute.snapshots.create``
       * ``compute.snapshots.delete``
       * ``compute.snapshots.list``

   For provider-specific credentials configurations, refer to the
   `Google authentication methods `_ and
   `service account impersonation `_.

4. **Specific Configuration**

   The fields ``gcp_project`` and ``gcp_zone`` are configuration options
   specific to GCP.

   .. code-block:: text

       gcp_project = GCP_PROJECT_ID
       gcp_zone = ZONE

Microsoft Azure
---------------

To use snapshot backups on Azure with Barman, ensure the following:

1. **Python Libraries**

   The ``azure-mgmt-compute`` and ``azure-identity`` libraries must be
   available for the Python distribution used by Barman. These libraries are
   optional and not included by default.

   Install them using pip:

   .. code:: bash

       pip3 install azure-mgmt-compute azure-identity

   .. note::
       The ``azure-mgmt-compute`` library requires Python 3.7 or later. Azure
       snapshots are not compatible with earlier Python versions.

2. **Disk Requirements**

   All disks involved in the snapshot backup must be managed disks attached to
   the VM instance as data disks.

3. **Access Control**

   Barman needs to access Azure using credentials obtained via managed identity
   or CLI login.

   The following environment variables are supported:
   ``AZURE_STORAGE_CONNECTION_STRING``, ``AZURE_STORAGE_KEY`` and
   ``AZURE_STORAGE_SAS_TOKEN``. You can also use the ``--credential`` option to
   specify either ``default``, ``azure-cli`` or ``managed-identity``
   credentials in order to authenticate via Azure Active Directory.

   .. important::
       Ensure the credential has the permissions listed below:

       * ``Microsoft.Compute/disks/read``
       * ``Microsoft.Compute/virtualMachines/read``
       * ``Microsoft.Compute/snapshots/read``
       * ``Microsoft.Compute/snapshots/write``
       * ``Microsoft.Compute/snapshots/delete``

   For provider-specific credential configurations, refer to the
   `Azure environment variables configurations `_,
   `Identity Package `_ and
   `DefaultAzureCredential documentation `_.

4. **Specific Configuration**

   The fields ``azure_subscription_id`` and ``azure_resource_group`` are
   configuration options specific to Azure.

   .. code-block:: text

       azure_subscription_id = AZURE_SUBSCRIPTION_ID
       azure_resource_group = AZURE_RESOURCE_GROUP

Amazon Web Services
-------------------

To use snapshot backups on :term:`AWS` with Barman, ensure the following:

1. **Python Libraries**

   The ``boto3`` library must be available for the Python distribution used by
   Barman. This library is optional and not included by default.

   Install it using pip:

   .. code:: bash

       pip3 install boto3

2. **Disk Requirements**

   All disks involved in the snapshot backup must be non-root EBS volumes
   attached to the same VM instance.

3. **Access Control**

   Barman needs access to AWS, so you must configure the AWS credentials with
   the ``awscli`` tool as the postgres user, entering the Access Key and Secret
   Key previously created in the IAM section of the AWS console.

   .. important::
       Ensure you have the permissions listed below:

       * ``ec2:CreateSnapshot``
       * ``ec2:CreateTags``
       * ``ec2:DeleteSnapshot``
       * ``ec2:DescribeSnapshots``
       * ``ec2:DescribeInstances``
       * ``ec2:DescribeVolumes``

   For provider-specific credentials configurations, refer to the
   `AWS boto3 configurations `_.

4. **Specific Configuration**

   The fields ``aws_region``, ``aws_profile`` and
   ``aws_await_snapshots_timeout`` are configuration options specific to AWS.

   ``aws_profile`` is the name of the AWS profile in the credentials file. If
   not used, the default profile will be applied. If no credentials file
   exists, credentials will come from the environment.

   ``aws_region`` overrides any region defined in the AWS profile.

   ``aws_await_snapshots_timeout`` is the timeout for waiting for snapshots to
   be created (default is ``3600`` seconds).

   When specifying ``snapshot_instance`` or ``snapshot_disks``, Barman accepts
   either the instance/volume ID or the name of the resource. If you use a
   name, Barman will query AWS for resources with a matching ``Name`` tag. If
   zero or multiple matches are found, Barman will return an error.

   .. code-block:: text

       aws_region = AWS_REGION
       aws_profile = AWS_PROFILE_NAME
       aws_await_snapshots_timeout = TIMEOUT_IN_SECONDS

5. **Ransomware Protection**

   Ransomware protection is essential to secure data and maintain operational
   stability. With Amazon EBS Snapshot Lock, snapshots are protected from
   deletion, providing an immutable backup that safeguards against ransomware
   attacks. Locking snapshots prevents unwanted deletions and ensures reliable
   recovery options in case of compromise.

   Barman can prevent unwanted deletion of backups by locking the snapshots
   when creating the backup.

   .. note::
       To delete a locked backup, you must first manually remove the lock in
       the AWS console.

   To lock a snapshot during backup creation, you need to configure the
   following options:

   1. Choose the snapshot lock mode: either ``compliance`` or ``governance``.
   2. Set either the lock duration or the expiration date (not both). Lock
      duration is specified in days, ranging from 1 to 36,500. If you choose an
      expiration date, it must be at least 1 day after the snapshot creation
      date and time, using the format ``YYYY-MM-DDTHH:MM:SS.sssZ``.
   3. Optionally, set a cool-off period (in hours), from 1 to 72. This option
      only applies when the lock mode is set to ``compliance``.

   .. code-block:: text

       aws_snapshot_lock_mode = compliance | governance
       aws_snapshot_lock_duration = 1
       aws_snapshot_lock_cool_off_period = 1
       aws_snapshot_lock_expiration_date = "2024-10-07T21:53:00.606Z"

   .. important::
       Ensure you have the permission listed below:

       * ``ec2:LockSnapshot``

   For the concepts behind AWS Snapshot Lock, refer to the
   `Amazon EBS snapshot lock concepts `_.

Backup Process
--------------

Here is an overview of the snapshot backup process:

1. Barman performs checks to validate the snapshot options, instance, and
   disks.

   Before each backup and during the ``barman check`` command, the following
   checks are performed:

   * The compute instance specified by ``snapshot_instance`` and any
     provider-specific arguments exists.
   * The disks listed in ``snapshot_disks`` are present.
   * The disks listed in ``snapshot_disks`` are attached to the
     ``snapshot_instance``.
   * The disks listed in ``snapshot_disks`` are mounted on the
     ``snapshot_instance``.

2. Barman initiates the backup using the Postgres backup API.
3. The cloud provider API is used to create a snapshot for each specified disk.
   Barman waits until each snapshot reaches a state that guarantees application
   consistency before proceeding to the next disk.
4. Additional provider-specific details, such as the device name, mount point,
   and mount options for each disk, are recorded in the backup metadata.

Metadata
^^^^^^^^

Regardless of whether you provision recovery disks and instances using
infrastructure-as-code, ad-hoc automation, or manually, you will need to use
Barman to identify the necessary snapshots for a specific backup. You can do
this with the ``barman show-backup`` command, which provides details for each
snapshot included in the backup.

For example:

.. code-block:: text

    Backup 20240813T200506:
      Server Name            : snapshot
      System Id              : 7402620047885836080
      Status                 : DONE
      PostgreSQL Version     : 160004
      PGDATA directory       : /opt/postgres/data
      Estimated Cluster Size : 22.7 MiB

      Server information:
        Checksums            : on

      Snapshot information:
        provider             : aws
        account_id           : 714574844897
        region               : sa-east-1

        device_name          : /dev/sdf
        snapshot_id          : snap-0d2288b4f30e3f9e3
        snapshot_name        : Barman_AWS:1:/dev/sdf-20240813t200506
        Mount point          : /opt/postgres
        Mount options        : rw,noatime,seclabel

      Base backup information:
        Backup Method        : snapshot-concurrent
        Backup Size          : 1.0 KiB (16.0 MiB with WALs)
        WAL Size             : 16.0 MiB
        Timeline             : 1
        Begin WAL            : 00000001000000000000001A
        End WAL              : 00000001000000000000001A
        Number of WALs       : 1
        Begin time           : 2024-08-14 16:21:50.820618+00:00
        End time             : 2024-08-14 16:22:38.264726+00:00
        Copy time            : 47 seconds
        Estimated throughput : 22 B/s
        Begin Offset         : 40
        End Offset           : 312
        Begin LSN            : 0/1A000028
        End LSN              : 0/1A000138

      WAL information:
        Number of files      : 1
        Disk usage           : 16.0 MiB
        WAL rate             : 5048.32/hour
        Last available       : 00000001000000000000001B

      Catalog information:
        Retention Policy     : not enforced
        Previous Backup      : - (this is the oldest base backup)
        Next Backup          : - (this is the latest base backup)

The ``--format=json`` option can be used when integrating with external
tooling.

.. code-block:: json

    {
      "snapshots_info": {
        "provider": "gcp",
        "provider_info": {
          "project": "project_id"
        },
        "snapshots": [
          {
            "mount": {
              "mount_options": "rw,noatime",
              "mount_point": "/opt/postgres"
            },
            "provider": {
              "device_name": "pgdata",
              "snapshot_name": "barman-av-ubuntu20-primary-pgdata-20230123t131430",
              "snapshot_project": "project_id"
            }
          },
          {
            "mount": {
              "mount_options": "rw,noatime",
              "mount_point": "/opt/postgres/tablespaces/tbs1"
            },
            "provider": {
              "device_name": "tbs1",
              "snapshot_name": "barman-av-ubuntu20-primary-tbs1-20230123t131430",
              "snapshot_project": "project_id"
            }
          }
        ]
      }
    }

The metadata found in ``snapshots_info/provider_info`` and
``snapshots_info/snapshots/*/provider`` varies depending on the cloud provider,
as detailed in the following sections.

**GCP**

``snapshots_info/provider_info``

* ``project``: The GCP project ID of the project which owns the resources
  involved in backup and recovery.

``snapshots_info/snapshots/*/provider``

* ``device_name``: The short device name with which the source disk for the
  snapshot was attached to the backup VM at the time of the backup.
* ``snapshot_name``: The name of the snapshot.
* ``snapshot_project``: The GCP project ID which owns the snapshot.

**Azure**

``snapshots_info/provider_info``

* ``subscription_id``: The Azure subscription ID which owns the resources
  involved in backup and recovery.
* ``resource_group``: The Azure resource group to which the resources involved
  in the backup belong.

``snapshots_info/snapshots/*/provider``

* ``location``: The Azure location of the disk from which the snapshot was
  taken.
* ``lun``: The LUN identifying the disk from which the snapshot was taken at
  the time of the backup.
* ``snapshot_name``: The name of the snapshot.

**AWS**

``snapshots_info/provider_info``

* ``account_id``: The ID of the AWS account which owns the resources used to
  make the backup.
* ``region``: The AWS region in which the resources involved in backup are
  located.

``snapshots_info/snapshots/*/provider``

* ``device_name``: The device to which the source disk was mapped on the backup
  VM at the time of the backup.
* ``snapshot_id``: The ID of the snapshot as assigned by AWS.
* ``snapshot_name``: The name of the snapshot.
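
As an illustration of how external tooling might consume this metadata, the following is a minimal Python sketch that flattens the ``snapshots_info`` structure into one record per snapshot. It is not part of Barman: the ``list_snapshots`` helper and the ``EXAMPLE`` document are hypothetical, and the sketch assumes the input is shaped like the JSON excerpt shown earlier in this section. The real ``show-backup`` output may nest ``snapshots_info`` under additional keys, so adjust the lookup accordingly.

```python
import json

# Hypothetical sample, shaped like the JSON excerpt in this section.
EXAMPLE = """
{
  "snapshots_info": {
    "provider": "gcp",
    "provider_info": {"project": "project_id"},
    "snapshots": [
      {
        "mount": {"mount_options": "rw,noatime",
                  "mount_point": "/opt/postgres"},
        "provider": {
          "device_name": "pgdata",
          "snapshot_name": "barman-av-ubuntu20-primary-pgdata-20230123t131430",
          "snapshot_project": "project_id"
        }
      },
      {
        "mount": {"mount_options": "rw,noatime",
                  "mount_point": "/opt/postgres/tablespaces/tbs1"},
        "provider": {
          "device_name": "tbs1",
          "snapshot_name": "barman-av-ubuntu20-primary-tbs1-20230123t131430",
          "snapshot_project": "project_id"
        }
      }
    ]
  }
}
"""


def list_snapshots(backup_info):
    """Return one flat dict per snapshot found in ``snapshots_info``.

    Each record merges the provider name, the backup-wide ``provider_info``
    fields, the per-snapshot ``provider`` fields and the ``mount`` fields.
    """
    info = backup_info["snapshots_info"]
    records = []
    for snapshot in info.get("snapshots", []):
        record = {"provider": info["provider"]}
        record.update(info.get("provider_info", {}))  # e.g. project, region
        record.update(snapshot.get("provider", {}))   # e.g. snapshot_name
        record.update(snapshot.get("mount", {}))      # mount_point, mount_options
        records.append(record)
    return records


if __name__ == "__main__":
    for record in list_snapshots(json.loads(EXAMPLE)):
        print(record["snapshot_name"], "->", record["mount_point"])
```

Because the provider-specific keys differ between GCP, Azure and AWS, a flat record of this kind is only a convenience for scripting; recovery tooling would still need to branch on the ``provider`` value to decide which fields to use.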