Backups#
Overview#
The backup command is used to back up an entire Postgres server according to the configuration file parameters. To use it, run:
barman backup [OPTIONS] SERVER_NAME
Note
For detailed information on the backup command, refer to the backup command reference.
Important
Before any interaction with Barman, you must ensure that the server is correctly configured. Refer to the Quickstart and Configuration Reference sections for the steps you need to complete before creating any backup.
Warning
Backup initiation will fail if WAL files are not correctly archived to Barman, either through the archiver or the streaming_archiver options.
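As a sketch, a minimal server configuration that satisfies these prerequisites might look like the following; the server name, hostnames, and connection details are illustrative, not prescriptive:

```ini
; /etc/barman.d/pg.conf -- illustrative sketch, not a drop-in file
[pg]
description = "Example Postgres server"
conninfo = host=pg user=barman dbname=postgres
ssh_command = ssh postgres@pg
backup_method = rsync
; WAL files must reach Barman before backups can succeed
archiver = on
```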
Barman offers multiple backup methods for Postgres servers, each with its own approach and requirements.
Prior to version 2.0, Barman relied solely on rsync for both standard backups and file-level incremental backups. Streaming backups were introduced in version 2.0. Starting with version 3.11, Barman also supports block-level incremental backups through the streaming connection.
Important
For Postgres 15 and higher, exclusive backups are no longer supported. The only method for taking backups is through concurrent backup. If backup_options is unset, Barman will automatically set it to concurrent_backup.
Incremental Backups#
Incremental backups involve using an existing backup as a reference to copy only the data changes that have occurred since the last backup on the Postgres server.
The primary objectives of incremental backups in Barman are:
Shorten the duration of the full backup process.
Reduce disk space usage by eliminating redundant data across periodic backups (data deduplication).
Barman supports two types of incremental backups:
File-level incremental backups (using rsync)
Block-level incremental backups (using pg_basebackup with Postgres 17)
Note
Incremental backups of different types are not compatible with each other. For example, you cannot take a block-level incremental backup on top of an rsync backup, nor can you take a file-level incremental backup on top of a streaming backup created with pg_basebackup.
Managing Bandwidth Usage#
You can control I/O bandwidth usage with the bandwidth_limit option (global or per server) by specifying a maximum rate in kilobytes per second. By default, this option is set to 0, meaning there is no bandwidth limit.
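For example, to cap Barman's backup I/O at roughly 4 MB/s (the value is illustrative):

```ini
bandwidth_limit = 4000
```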
If you need to manage I/O workload on specific tablespaces, use the tablespace_bandwidth_limit option (global or per server) to set limits for individual tablespaces:
tablespace_bandwidth_limit = tbname:bwlimit[, tbname:bwlimit, ...]
This option takes a comma-separated list of tablespace name and bandwidth limit pairs (in kilobytes per second).
When backing up a server, Barman will check for tablespaces listed in this option. If a matching tablespace is found, the specified bandwidth limit will be applied. If no match is found, the default bandwidth limit for the server will be used.
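The lookup described above can be sketched in Python; the parse_tablespace_limits helper and the tablespace names are hypothetical, purely for illustration:

```python
def parse_tablespace_limits(value):
    """Parse a 'tbname:bwlimit[, tbname:bwlimit, ...]' string into a dict
    mapping tablespace name to bandwidth limit in kB/s."""
    limits = {}
    for pair in value.split(","):
        name, _, limit = pair.strip().rpartition(":")
        limits[name] = int(limit)
    return limits

def limit_for(tablespace, limits, default_limit):
    # Fall back to the server-wide bandwidth_limit when no match is found.
    return limits.get(tablespace, default_limit)

limits = parse_tablespace_limits("tbs_hot:8000, tbs_cold:1000")
print(limit_for("tbs_hot", limits, 4000))   # 8000, matching tablespace
print(limit_for("tbs_other", limits, 4000)) # 4000, falls back to bandwidth_limit
```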
Important
The bandwidth_limit option is available with the rsync and postgres backup methods, but the tablespace_bandwidth_limit option is only applicable when using rsync.
Network Compression#
You can reduce the size of data transferred over the network by using network compression. This can be enabled with the network_compression option (global or per server):
network_compression = true | false
Important
The network_compression option is not available with the postgres backup method.
Setting this option to true will enable data compression for network transfers during both backup and recovery. By default, this option is set to false.
Backup Compression#
Barman supports backup compression using the pg_basebackup tool. This feature can be enabled with the backup_compression option (global or per server).
Important
The backup_compression option, along with the other options discussed here, is only available with the postgres backup method.
Compression Algorithms#
Setting the backup_compression option will compress the backup using the specified algorithm. Supported algorithms in Barman are: gzip, lz4, zstd, and none (which results in an uncompressed backup).
backup_compression = gzip | lz4 | zstd | none
Barman requires the corresponding CLI utilities for the selected compression algorithm to be installed on both the Barman server and the Postgres server. These utilities can be installed via system packages named gzip, lz4, and zstd on Debian, Ubuntu, RedHat, CentOS, and SLES systems.
On Ubuntu 18.04 (bionic), the lz4 utility is available in the liblz4-tool package.
lz4 and zstd are supported with Postgres 15 or higher.
Important
If using backup_compression, you must also set recovery_staging_path to enable recovery of compressed backups. Refer to the Recovering Compressed backups section for details.
Compression Workers#
You can use multiple threads to speed up compression by setting the backup_compression_workers option (default is 0):
backup_compression_workers = 2
Note
This option is available only with zstd compression. The zstd version must be 1.5.0 or higher, or 1.4.4 or higher with multithreading enabled.
Compression Level#
Specify the compression level with the backup_compression_level option. This should be an integer value supported by the chosen compression algorithm. If not specified, the default value for the algorithm will be used.
For none compression, backup_compression_level must be set to 0.
The available levels and default values depend on the chosen compression algorithm. Check the backup configuration options section for details.
For Postgres versions prior to 15, gzip supports only backup_compression_level = 0, which uses the default compression level.
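Putting these options together, a configuration enabling multi-threaded zstd compression might look like this (the specific values are illustrative):

```ini
backup_method = postgres
backup_compression = zstd
backup_compression_workers = 2
backup_compression_level = 3
```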
Compression Location#
For Postgres 15 or higher, you can choose where compression occurs: on the server or the client. Set the backup_compression_location option:
backup_compression_location = server | client
server: Compression occurs on the Postgres server, reducing network bandwidth but increasing server workload.
client: Compression is handled by pg_basebackup on the client side.
When backup_compression_location is set to server, you can also configure backup_compression_format:
backup_compression_format = plain | tar
plain: pg_basebackup decompresses data before writing to disk.
tar: Backups are written as compressed tarballs (default).
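For example, to offload compression to the Postgres server and keep backups as compressed tarballs (a sketch; the algorithm choice is illustrative):

```ini
backup_compression = lz4
backup_compression_location = server
backup_compression_format = tar
```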
Depending on the chosen backup_compression and backup_compression_format, you may need to install additional tools on both the Postgres and Barman servers.
may need to install additional tools on both the Postgres and Barman servers.
Refer to the table below to select the appropriate tools for your configuration.
backup_compression | backup_compression_format | Postgres | Barman
---|---|---|---
gzip | plain | tar | None
gzip | tar | tar | tar
lz4 | plain | tar, lz4 | None
lz4 | tar | tar, lz4 | tar, lz4
zstd | plain | tar, zstd | None
zstd | tar | tar, zstd | tar, zstd
none | tar | tar | tar
Immediate Checkpoint#
Before starting a backup, Barman requests a checkpoint, which can generate additional workload. By default, this checkpoint is managed according to Postgres’ workload control settings, which may delay the backup.
You can modify this default behavior using the immediate_checkpoint configuration option (default is false).
If immediate_checkpoint is set to true, Postgres will perform the checkpoint at maximum speed without throttling, allowing the backup to begin as quickly as possible.
You can override this configuration at any time by using one of the following options with the barman backup command:
--immediate-checkpoint: Forces an immediate checkpoint.
--no-immediate-checkpoint: Waits for the checkpoint to complete before starting the backup.
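For instance, you could enable fast checkpoints globally and still skip them for a single run (a sketch; the server name pg is illustrative):

```ini
; global barman.conf section: checkpoint at full speed before every backup
immediate_checkpoint = true
```

A one-off backup can then opt out with barman backup --no-immediate-checkpoint pg.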
Streaming Backup#
Barman can perform a backup of a Postgres server using a streaming connection with pg_basebackup.
Important
pg_basebackup must be installed on the Barman server. It is recommended to use the latest version of pg_basebackup, as it is backwards compatible. Multiple versions can be installed and specified using the path_prefix option in the configuration file.
To configure streaming backups, set the backup_method option to postgres:
backup_method = postgres
Block-level Incremental Backup#
This type of backup uses the native incremental backup feature introduced in Postgres 17.
Block-level incremental backups deduplicate data at the page level in Postgres. This means only pages modified since the last backup need to be stored, which is more efficient, especially for large databases with frequent writes.
To perform block-level incremental backups in Barman, use the --incremental option with the backup command. You must provide a backup ID or shortcut referencing a previous backup (full or incremental) created with backup_method=postgres for deduplication. Alternatively, you can use last-full or latest-full to reference the most recent eligible full backup in the catalog.
Example command:
barman backup --incremental BACKUP_ID SERVER_NAME
To use block-level incremental backups in Barman, you must:
Use Postgres 17 or later.
Enable summarize_wal on your database server before taking the initial full backup, as this feature relies on WAL summarization.
Use backup_method=postgres.
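The prerequisites above can be sketched as follows; the file locations and server name are illustrative:

```ini
# postgresql.conf on the database server (Postgres 17+)
summarize_wal = on

# Barman server configuration
backup_method = postgres
```

With these in place, barman backup --incremental latest-full SERVER_NAME takes a block-level incremental backup on top of the most recent eligible full backup.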
Note
Compressed backups are currently not supported for block-level incremental backups in Barman.
Important
If you enable data_checksums between block-level incremental backups, it's advisable to take a new full backup. Divergent checksum configurations can potentially cause issues during recovery.
Backup with Rsync through SSH#
Barman can perform a backup of a Postgres server using Rsync, which uses SSH as a transport mechanism.
To configure a backup using rsync, include the following parameters in the Barman server configuration file:
backup_method = rsync
ssh_command = ssh postgres@pg
Here, backup_method activates the rsync backup method, and ssh_command specifies the SSH connection details from the Barman server to the Postgres server.
Note
Starting with Barman 3.11, a keep-alive mechanism is used for rsync-based backups. This mechanism sends a simple SELECT 1 query over the libpq connection to prevent firewall or router disconnections due to idle connections. You can control or disable this mechanism using the keepalive_interval configuration option.
File-Level Incremental Backups#
File-level incremental backups rely on rsync and hard links, so both the operating system and the file system where the backup data is stored must support these features.
The core idea is that during a subsequent base backup, files that haven't changed since the last backup are shared, which saves disk space. This is especially beneficial for very large databases (VLDBs) and databases with a high percentage of read-only historical tables.
You can enable rsync incremental backups through a global/server option called reuse_backup, which manages the Barman backup command. It accepts three values:
off: Standard full backup (default).
link: File-level incremental backup that reuses the last backup and creates hard links for unchanged files, reducing both backup space and time.
copy: File-level incremental backup that reuses the last backup and creates copies of unchanged files, reducing backup time but not space.
Typically, you would set reuse_backup to link as follows:
reuse_backup = link
Setting this at the global level automatically enables incremental backups for all your servers.
You can override this setting with the --reuse-backup runtime option when running the Barman backup command. For example, to run a one-off incremental backup, use:
barman backup --reuse-backup=link <server_name>
Note
Unlike block-level incremental backups, rsync file-level incremental backups are
self-contained. If a parent backup is deleted, the integrity of other backups is not
affected. Deduplication in rsync backups uses hard links, meaning that when a reused
backup is deleted, you don’t need to create a new full backup; shared files will
remain on disk until the last backup that used those files is also deleted.
Additionally, using reuse_backup for the initial backup has no effect, as it will still be treated as a full backup due to the absence of existing files to link.
Concurrent Backup of a Standby#
When performing a backup from a standby server, ensure the following configuration options are set to point to the standby:
conninfo
streaming_conninfo (if using backup_method = postgres or streaming_archiver = on)
ssh_command (if using backup_method = rsync)
wal_conninfo (connecting to the primary if conninfo is pointing to a standby)
The primary_conninfo option should point to the primary server. Barman will use primary_conninfo to trigger a new WAL switch on the primary, allowing the concurrent backup from the standby to complete without waiting for a natural WAL switch.
Note
It's crucial to configure primary_conninfo if backing up a standby during periods of minimal or no write activity on the primary.
In Barman 3.8.0 and later, if primary_conninfo is configured, you can also set the primary_checkpoint_timeout option. This specifies the maximum wait time (in seconds) for a new WAL file before Barman forces a checkpoint on the primary. This timeout should exceed the archive_timeout value set on the primary.
If primary_conninfo is not set, the backup will still proceed but will pause at the stop backup stage until the last archived WAL segment is newer than the latest WAL required by the backup.
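A configuration following these rules for backing up a standby might look like the following; the hostnames and timeout value are illustrative:

```ini
conninfo = host=standby user=barman dbname=postgres
streaming_conninfo = host=standby user=streaming_barman
backup_method = postgres
primary_conninfo = host=primary user=barman dbname=postgres
; should exceed archive_timeout on the primary
primary_checkpoint_timeout = 120
```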
Barman requires that WAL files and backup data originate from the same Postgres cluster. If the standby is promoted to primary, the existing backups and WALs remain valid. However, you should update the Barman configuration to use the new standby for future backups and WAL retrieval.
Note
In case of a failover on the Postgres cluster, you can update the Barman configuration with Configuration Models.
WALs can be retrieved from the standby via WAL streaming or WAL archiving. Refer to the concepts section for more details. If you want to start working with WAL streaming or WAL archiving, refer to the quickstart section on streaming backups with wal streaming or rsync backups with wal archiving.
Note
For Postgres 10 and earlier, Barman cannot handle simultaneous WAL streaming and archiving on a standby. You must disable one if the other is in use, as WALs from Postgres 10 and earlier may differ at the binary level, leading to false-positive detection issues in Barman.
Cloud Snapshot Backups#
Barman can perform backups of Postgres servers deployed in specific cloud environments
by utilizing snapshots of storage volumes. In this setup, Postgres file backups are
represented as volume snapshots stored in the cloud, while Barman functions as the
storage server for Write-Ahead Logs (WALs) and the backup catalog. Despite the backup
data being stored in the cloud, Barman manages these backups similarly to traditional ones created with the rsync or postgres backup methods.
Note
Additionally, snapshot backups can be created without a Barman server by using the barman-cloud-backup command directly on the Postgres server. Refer to the barman-cli-cloud section for more information on how to properly work with this option.
Important
The following configuration options and equivalent command arguments (if applicable) are not available when using backup_method=snapshot:
backup_compression
bandwidth_limit (--bwlimit)
parallel_jobs (--jobs)
network_compression
reuse_backup (--reuse-backup)
To configure a backup using snapshot, include the following parameters in the Barman server configuration file:
backup_method = snapshot
snapshot_provider = CLOUD_PROVIDER
snapshot_instance = INSTANCE_NAME
snapshot_disks = DISK_NAME1,DISK_NAME2
Important
Ensure snapshot_disks includes all disks that store Postgres data. Any data stored on a disk not listed will not be backed up and will be unavailable during recovery.
Requirements and Configuration#
To use the snapshot backup method with Barman, your deployment must meet these requirements:
Postgres must be running on a compute instance provided by a supported cloud provider.
All critical data, including PGDATA and tablespace data, must be stored on storage volumes that support snapshots.
The findmnt command must be available on the Postgres host.
Important
Configuration files stored outside of PGDATA will not be included in the snapshots. You will need to manage these files separately, using a configuration management system or other mechanisms.
Google Cloud Platform#
To use snapshot backups on GCP with Barman, please ensure the following:
Python Libraries
Install the google-cloud-compute and grpcio libraries for the Python distribution used by Barman. These libraries are optional and not included by default.
Install them using pip:
pip3 install grpcio google-cloud-compute
Note
The google-cloud-compute library requires Python 3.7 or newer. GCP snapshots are not compatible with earlier Python versions.
Disk Requirements
The disks used in the snapshot backup must be zonal persistent disks. Regional persistent disks are not supported at this time.
Access Control
Barman needs a service account with specific permissions. You can either attach this account to the compute instance running Barman (recommended) or use the GOOGLE_APPLICATION_CREDENTIALS environment variable to specify a credentials file.
Important
Ensure the service account has the permissions listed below:
compute.disks.createSnapshot
compute.disks.get
compute.globalOperations.get
compute.instances.get
compute.snapshots.create
compute.snapshots.delete
compute.snapshots.list
For provider-specific credential configuration, refer to the Google authentication methods and service account impersonation documentation.
Specific Configuration
The fields gcp_project and gcp_zone are configuration options specific to GCP.
gcp_project = GCP_PROJECT_ID
gcp_zone = ZONE
Microsoft Azure#
To use snapshot backups on Azure with Barman, ensure the following:
Python Libraries
The azure-mgmt-compute and azure-identity libraries must be available for the Python distribution used by Barman. These libraries are optional and not included by default.
Install them using pip:
pip3 install azure-mgmt-compute azure-identity
Note
The azure-mgmt-compute library requires Python 3.7 or later. Azure snapshots are not compatible with earlier Python versions.
Disk Requirements
All disks involved in the snapshot backup must be managed disks attached to the VM instance as data disks.
Access Control
Barman needs to access Azure using credentials obtained via managed identity or CLI login.
The following environment variables are supported: AZURE_STORAGE_CONNECTION_STRING, AZURE_STORAGE_KEY, and AZURE_STORAGE_SAS_TOKEN. You can also use the --credential option to specify either azure-cli or managed-identity credentials in order to authenticate via Azure Active Directory.
Important
Ensure the credential has the permissions listed below:
Microsoft.Compute/disks/read
Microsoft.Compute/virtualMachines/read
Microsoft.Compute/snapshots/read
Microsoft.Compute/snapshots/write
Microsoft.Compute/snapshots/delete
For provider-specific credential configuration, refer to the Azure environment variables configurations and Identity Package.
Specific Configuration
The fields azure_subscription_id and azure_resource_group are configuration options specific to Azure.
azure_subscription_id = AZURE_SUBSCRIPTION_ID
azure_resource_group = AZURE_RESOURCE_GROUP
Amazon Web Services#
To use snapshot backups on AWS with Barman, please ensure the following:
Python Libraries
The boto3 library must be available for the Python distribution used by Barman. This library is optional and not included by default.
Install it using pip:
pip3 install boto3
Disk Requirements
All disks involved in the snapshot backup must be non-root EBS volumes attached to the same VM instance. NVMe volumes are not supported.
Access Control
Barman needs access to AWS, so you must configure the AWS credentials with the awscli tool as the postgres user, entering the Access Key and Secret Key previously created in the IAM section of the AWS console.
Important
Ensure you have the permissions listed below:
ec2:CreateSnapshot
ec2:CreateTags
ec2:DeleteSnapshot
ec2:DescribeSnapshots
ec2:DescribeInstances
ec2:DescribeVolumes
For provider-specific credential configuration, refer to the AWS boto3 configurations.
Specific Configuration
The fields aws_region, aws_profile, and aws_await_snapshots_timeout are configuration options specific to AWS.
aws_profile is the name of the AWS profile in the credentials file. If not used, the default profile will be applied. If no credentials file exists, credentials will come from the environment.
aws_region overrides any region defined in the AWS profile.
aws_await_snapshots_timeout is the timeout for waiting for snapshots to be created (default is 3600 seconds).
When specifying snapshot_instance or snapshot_disks, Barman accepts either the instance/volume ID or the name of the resource. If you use a name, Barman will query AWS for resources with a matching Name tag. If zero or multiple matches are found, Barman will return an error.
aws_region = AWS_REGION
aws_profile = AWS_PROFILE_NAME
aws_await_snapshots_timeout = TIMEOUT_IN_SECONDS
Ransomware Protection
Ransomware protection is essential to secure data and maintain operational stability. With Amazon EBS Snapshot Lock, snapshots are protected from deletion, providing an immutable backup that safeguards against ransomware attacks. By locking snapshots, unwanted deletions are prevented, ensuring reliable recovery options in case of compromise. Barman can prevent unwanted deletion of backups by locking the snapshots when creating the backup.
Note
To delete a locked backup, you must first manually remove the lock in the AWS console.
To lock a snapshot during backup creation, you need to configure the following options:
Choose the snapshot lock mode: either compliance or governance.
Set either the lock duration or the expiration date (not both). Lock duration is specified in days, ranging from 1 to 36,500. If you choose an expiration date, it must be at least 1 day after the snapshot creation date and time, using the format YYYY-MM-DDTHH:MM:SS.sssZ.
Optionally, set a cool-off period (in hours), from 1 to 72. This option only applies when the lock mode is set to compliance.
aws_snapshot_lock_mode = compliance | governance
aws_snapshot_lock_duration = 1
aws_snapshot_lock_cool_off_period = 1
aws_snapshot_lock_expiration_date = "2024-10-07T21:53:00.606Z"
Important
Ensure you have the permission listed below:
ec2:LockSnapshot
For the concepts behind AWS Snapshot Lock, refer to the Amazon EBS snapshot lock concepts.
Backup Process#
Here is an overview of the snapshot backup process:
- Barman performs checks to validate the snapshot options, instance, and disks. Before each backup and during the barman check command, the following checks are performed:
  - The compute instance specified by snapshot_instance and any provider-specific arguments exists.
  - The disks listed in snapshot_disks are present.
  - The disks listed in snapshot_disks are attached to the snapshot_instance.
  - The disks listed in snapshot_disks are mounted on the snapshot_instance.
- Barman initiates the backup using the Postgres backup API.
- The cloud provider API is used to create a snapshot for each specified disk. Barman waits until each snapshot reaches a state that guarantees application consistency before proceeding to the next disk.
- Additional provider-specific details, such as the device name, mount point, and mount options for each disk, are recorded in the backup metadata.
Metadata#
Regardless of whether you provision recovery disks and instances using infrastructure-as-code, ad-hoc automation, or manually, you will need to use Barman to identify the necessary snapshots for a specific backup. You can do this with the barman show-backup command, which provides details for each snapshot included in the backup.
For example:
Backup 20240813T200506:
Server Name : snapshot
System Id : 7402620047885836080
Status : DONE
PostgreSQL Version : 160004
PGDATA directory : /opt/postgres/data
Estimated Cluster Size : 22.7 MiB
Server information:
Checksums : on
Snapshot information:
provider : aws
account_id : 714574844897
region : sa-east-1
device_name : /dev/sdf
snapshot_id : snap-0d2288b4f30e3f9e3
snapshot_name : Barman_AWS:1:/dev/sdf-20240813t200506
Mount point : /opt/postgres
Mount options : rw,noatime,seclabel
Base backup information:
Backup Method : snapshot-concurrent
Backup Size : 1.0 KiB (16.0 MiB with WALs)
WAL Size : 16.0 MiB
Timeline : 1
Begin WAL : 00000001000000000000001A
End WAL : 00000001000000000000001A
WAL number : 1
Begin time : 2024-08-14 16:21:50.820618+00:00
End time : 2024-08-14 16:22:38.264726+00:00
Copy time : 47 seconds
Estimated throughput : 22 B/s
Begin Offset : 40
End Offset : 312
Begin LSN : 0/1A000028
End LSN : 0/1A000138
WAL information:
No of files : 1
Disk usage : 16.0 MiB
WAL rate : 5048.32/hour
Last available : 00000001000000000000001B
Catalog information:
Retention Policy : not enforced
Previous Backup : - (this is the oldest base backup)
Next Backup : - (this is the latest base backup)
The --format=json option can be used when integrating with external tooling.
{
"snapshots_info": {
"provider": "gcp",
"provider_info": {
"project": "project_id"
},
"snapshots": [
{
"mount": {
"mount_options": "rw,noatime",
"mount_point": "/opt/postgres"
},
"provider": {
"device_name": "pgdata",
"snapshot_name": "barman-av-ubuntu20-primary-pgdata-20230123t131430",
"snapshot_project": "project_id"
}
},
{
"mount": {
"mount_options": "rw,noatime",
"mount_point": "/opt/postgres/tablespaces/tbs1"
},
"provider": {
"device_name": "tbs1",
"snapshot_name": "barman-av-ubuntu20-primary-tbs1-20230123t131430",
"snapshot_project": "project_id"
}
}
]
}
}
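When scripting against this JSON, the snapshot identifiers and their mount points can be extracted with a few lines of Python; the show_backup_json variable below is a stand-in for the command output, with the sample values from above:

```python
import json

# Stand-in for the output of `barman show-backup SERVER_NAME BACKUP_ID --format=json`
show_backup_json = """
{"snapshots_info": {"provider": "gcp",
 "provider_info": {"project": "project_id"},
 "snapshots": [
   {"mount": {"mount_options": "rw,noatime", "mount_point": "/opt/postgres"},
    "provider": {"device_name": "pgdata",
                 "snapshot_name": "barman-av-ubuntu20-primary-pgdata-20230123t131430",
                 "snapshot_project": "project_id"}}]}}
"""

info = json.loads(show_backup_json)["snapshots_info"]
for snap in info["snapshots"]:
    # Pair each snapshot with the mount point it must be attached to on recovery.
    print(snap["provider"]["snapshot_name"], "->", snap["mount"]["mount_point"])
```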
The metadata found in snapshots_info/provider_info and snapshots_info/snapshots/*/provider varies depending on the cloud provider, as detailed in the following sections.
GCP
snapshots_info/provider_info
project: The GCP project ID of the project which owns the resources involved in backup and recovery.
snapshots_info/snapshots/*/provider
device_name: The short device name with which the source disk for the snapshot was attached to the backup VM at the time of the backup.
snapshot_name: The name of the snapshot.
snapshot_project: The GCP project ID which owns the snapshot.
Azure
snapshots_info/provider_info
subscription_id: The Azure subscription ID which owns the resources involved in backup and recovery.
resource_group: The Azure resource group to which the resources involved in the backup belong.
snapshots_info/snapshots/*/provider
location: The Azure location of the disk from which the snapshot was taken.
lun: The LUN identifying the disk from which the snapshot was taken at the time of the backup.
snapshot_name: The name of the snapshot.
AWS
snapshots_info/provider_info
account_id: The ID of the AWS account which owns the resources used to make the backup.
region: The AWS region in which the resources involved in backup are located.
snapshots_info/snapshots/*/provider
device_name: The device to which the source disk was mapped on the backup VM at the time of the backup.
snapshot_id: The ID of the snapshot as assigned by AWS.
snapshot_name: The name of the snapshot.