Barman (Backup and Recovery Manager) is an open-source administration tool, written in Python, for disaster recovery of PostgreSQL servers. It allows your organisation to perform remote backups of multiple servers in business-critical environments to reduce risk and help DBAs during the recovery phase.

Barman is distributed under GNU GPL 3 and maintained by 2ndQuadrant, a platinum sponsor of the PostgreSQL project.

IMPORTANT: This manual assumes that you are familiar with theoretical disaster recovery concepts, and that you have a grasp of PostgreSQL fundamentals in terms of physical backup and disaster recovery.

Introduction

In a perfect world, there would be no need for a backup. However, it is important, especially in business environments, to be prepared for when the “unexpected” happens. In a database scenario, the unexpected could take any of the following forms:

  • data corruption
  • system failure (including hardware failure)
  • human error
  • natural disaster

In such cases, any ICT manager or DBA should be able to fix the incident and recover the database in the shortest time possible. We normally refer to this discipline as disaster recovery, and more broadly business continuity.

Within business continuity, it is important to familiarise yourself with two fundamental metrics, as defined by Wikipedia:

  • Recovery Point Objective (RPO): “maximum targeted period in which data might be lost from an IT service due to a major incident”
  • Recovery Time Objective (RTO): “the targeted duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity”

In a few words, RPO represents the maximum amount of data you can afford to lose, while RTO represents the maximum down-time you can afford for your service.

Understandably, we all want RPO=0 (“zero data loss”) and RTO=0 (zero down-time, utopia), even if it is our grandmother’s recipe book that we are serving. In reality, a careful cost analysis phase allows you to determine your business continuity requirements.

Fortunately, with an open source stack composed of Barman and PostgreSQL, you can achieve RPO=0 thanks to synchronous streaming replication. RTO is more the focus of a High Availability solution, like repmgr. Therefore, by integrating Barman and repmgr, you can dramatically reduce RTO to nearly zero.

Based on our experience at 2ndQuadrant, we can confirm that PostgreSQL open source clusters with Barman and repmgr can easily achieve more than 99.99% uptime over a year, if properly configured and monitored.
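To put an availability figure such as 99.99% in perspective, the short Python sketch below converts an uptime target into a yearly downtime budget. It assumes a 365-day year, and the function name is purely illustrative (it is not part of Barman or repmgr).

```python
def downtime_budget_minutes(availability_percent, days=365):
    """Return the maximum downtime, in minutes, allowed over `days` days.

    `availability_percent` is the uptime target, for example 99.99.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


if __name__ == "__main__":
    # Print the yearly budget for a few common targets.
    for target in (99.9, 99.99, 99.999):
        print(target, round(downtime_budget_minutes(target), 2))
```

With these assumptions, a 99.99% target leaves a little under one hour of downtime per year, which is a useful number to compare against your measured recovery times.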

In any case, we want to place more emphasis on the cultural aspects of disaster recovery than on the actual tools. Tools without human beings are useless.

Our mission with Barman is to promote a culture of disaster recovery that:

  • focuses on backup procedures
  • focuses even more on recovery procedures (hence the name Barman with the ‘B’ and the ‘R’)
  • relies on education and training on strong theoretical and practical concepts of PostgreSQL’s crash recovery, backup, Point-In-Time-Recovery, and replication for your team members
  • promotes testing your backups (only a backup that is tested can be considered to be valid), either manually or automatically (be creative with Barman’s hook scripts!)
  • fosters regular practice of recovery procedures, by all members of your devops team (yes, developers too, not just system administrators and DBAs)
  • encourages regularly scheduled drills and disaster recovery simulations with your team every 3-6 months
  • relies on continuous monitoring of PostgreSQL and Barman, and that is able to promptly identify any anomalies

Moreover, do everything you can to prepare yourself and your team for when the disaster happens (yes, when), because when it happens:

  • It is going to be a Friday evening, most likely right when you are about to leave the office.
  • It is going to be when you are on holiday (right in the middle of your cruise around the world) and somebody else has to deal with it.
  • It is certainly going to be stressful.
  • You will regret not being sure that the last available backup is valid.
  • Unless you know approximately how long it takes to recover, every second will seem like forever.

Be prepared, don’t be scared.

In 2011, with these goals in mind, 2ndQuadrant started the development of Barman, now one of the most used backup tools for PostgreSQL. Barman is an acronym for “Backup and Recovery Manager”.

Currently, Barman works only on Linux and Unix operating systems.

Before you start

Before you start using Barman, it is fundamental that you get familiar with PostgreSQL and the concepts around physical backups, Point-In-Time-Recovery and replication, such as base backups, WAL archiving, etc.

Below you can find a non-exhaustive list of resources that we recommend you read:

Professional training on these topics is another effective way of learning these concepts. At any time of the year you can find many courses available all over the world, delivered by PostgreSQL companies such as 2ndQuadrant.

Design and architecture

Where to install Barman

One of the foundations of Barman is the ability to operate remotely from the database server, via the network.

Theoretically, you could have your Barman server located in a data centre in another part of the world, thousands of miles away from your PostgreSQL server. Realistically, you do not want your Barman server to be too far from your PostgreSQL server, so that both backup and recovery times are kept under control.

Even though there is no “one size fits all” way to set up Barman, there are a few recommendations that we suggest you follow, in particular:

  • Install Barman on a dedicated server
  • Do not share the same storage with your PostgreSQL server
  • Integrate Barman with your monitoring infrastructure
  • Test everything before you deploy it to production

A reasonable way to start modelling your disaster recovery architecture is to:

  • design a couple of possible architectures with respect to PostgreSQL and Barman, such as:
    1. same data centre
    2. different data centre in the same metropolitan area
    3. different data centre
  • elaborate the pros and the cons of each hypothesis
  • evaluate the single points of failure (SPOF) of your system, with cost-benefit analysis
  • make your decision and implement the initial solution

Having said this, a very common setup for Barman is to be installed in the same data centre where your PostgreSQL servers are. In this case, the single point of failure is the data centre. Fortunately, the impact of such a SPOF can be alleviated thanks to a feature called hook scripts. Indeed, backups of Barman can be exported on different media, such as tape via tar, or locations, like an S3 bucket in the Amazon cloud.

Remember that no decision is forever. You can start this way and adapt over time to the solution that suits you best. However, try and keep it simple to start with.

One Barman, many PostgreSQL servers

Another relevant feature that was first introduced by Barman is support for multiple servers. Barman can store backup data coming from multiple PostgreSQL instances, even with different versions, in a centralised way.

As a result, you can model complex disaster recovery architectures, forming a “star schema”, where PostgreSQL servers rotate around a central Barman server.

Every architecture makes sense in its own way. Choose the one that resonates with you, and most importantly, the one you trust, based on real experimentation and testing.

From this point forward, for the sake of simplicity, this guide will assume a basic architecture:

  • one PostgreSQL instance (with host name pg)
  • one backup server with Barman (with host name backup)

Streaming backup vs rsync/SSH

Traditionally, Barman has always operated remotely via SSH, taking advantage of rsync for physical backup operations. Version 2 introduces native support for PostgreSQL’s streaming replication protocol for backup operations, via pg_basebackup.

Choosing one of these two methods is a decision you will need to make.

On a general basis, starting from Barman 2.0, backup over streaming replication is the recommended setup for PostgreSQL 9.4 or higher. Moreover, if you do not make use of tablespaces, backup over streaming can be used starting from PostgreSQL 9.2.

IMPORTANT: Because Barman transparently makes use of pg_basebackup, features such as incremental backup, deduplication, and network compression are currently not available with streaming backup. Bandwidth limitation is also more restricted than with the traditional method via rsync.

Traditional backup via rsync/SSH is available for all versions of PostgreSQL starting from 8.3, and it is recommended in all cases where pg_basebackup limitations occur (for example, a very large database that can benefit from incremental backup and deduplication).

The reason why we recommend streaming backup is that, based on our experience, it is easier to set up than the traditional one. Also, streaming backup allows you to back up a PostgreSQL server on Windows, and makes life easier when working with Docker.

Standard archiving, WAL streaming … or both

PostgreSQL’s Point-In-Time-Recovery requires that transactional logs, also known as xlog or WAL files, are stored alongside of base backups.

Traditionally, Barman has supported standard WAL file shipping through PostgreSQL’s archive_command (usually via rsync/SSH). With this method, WAL files are archived only when PostgreSQL switches to a new WAL file. To keep it simple, this normally happens every 16MB worth of data changes.
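With archive_command alone, changes written to the current, not yet completed WAL file have not reached Barman. As a rough illustration, the Python sketch below estimates how long a 16MB segment takes to fill at a given WAL generation rate, which approximates the exposure window. It is a simplification: it assumes a constant write rate and ignores settings such as archive_timeout that force an earlier switch. The function name is illustrative only.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size (16MB)


def seconds_to_fill_segment(wal_bytes_per_second):
    """Estimate how long PostgreSQL needs to fill one WAL segment.

    Assumes a constant WAL generation rate, which is rarely true in
    practice; treat the result as an order of magnitude only.
    """
    if wal_bytes_per_second <= 0:
        raise ValueError("WAL generation rate must be positive")
    return WAL_SEGMENT_BYTES / wal_bytes_per_second


if __name__ == "__main__":
    # A quiet server writing 64 kB of WAL per second.
    print(seconds_to_fill_segment(64 * 1024))
```

On a quiet server the window can span several minutes, which is why low-traffic systems benefit the most from an additional archiving channel.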

Barman 1.6.0 introduces streaming of WAL files for PostgreSQL servers 9.2 or higher, as an additional method for transactional log archiving, through pg_receivexlog. WAL streaming is able to reduce the risk of data loss, bringing RPO down to near zero values.

Barman 2.0 introduces support for replication slots with PostgreSQL servers 9.4 or above, therefore allowing WAL streaming-only configurations. Moreover, you can now add Barman as a synchronous WAL receiver in your PostgreSQL 9.5 (or higher) cluster, and achieve zero data loss (RPO=0).

In some cases you have no choice and you are forced to use traditional archiving. In others, you can choose whether to use both or just WAL streaming. Unless you have strong reasons not to do so, we recommend using both channels, for maximum reliability and robustness.

Two typical scenarios for backups

In order to make life easier for you, below we summarise the two most typical scenarios for a given PostgreSQL server in Barman.

Bear in mind that this is a decision that you must make for every single server that you decide to back up with Barman. This means that you can have heterogeneous setups within the same installation.

As mentioned before, we will only worry about the PostgreSQL server (pg) and the Barman server (backup). However, in real life, your architecture will most likely contain other technologies such as repmgr, pgBouncer, Nagios/Icinga, etc.

Scenario 1: Backup via streaming protocol

If you are using PostgreSQL 9.4 or higher, and your database falls under a general use case scenario, you will likely opt for a streaming backup installation - see figure below.

Figure: Streaming-only backup (Scenario 1)

In this scenario, you will need to configure:

  1. a standard connection to PostgreSQL, for management, coordination, and monitoring purposes
  2. a streaming replication connection that will be used by both pg_basebackup (for base backup operations) and pg_receivexlog (for WAL streaming)

This setup, in Barman’s terminology, is known as streaming-only setup, as it does not require any SSH connection for backup and archiving operations. This is particularly suitable and extremely practical for Docker environments.

However, as mentioned before, you can configure standard archiving as well and implement a more robust architecture - see figure below.

Figure: Streaming backup with WAL archiving (Scenario 1b)

This alternate approach requires:

  • an additional SSH connection that allows the postgres user on the PostgreSQL server to connect as barman user on the Barman server
  • the archive_command in PostgreSQL be configured to ship WAL files to Barman

This architecture is available also to PostgreSQL 9.2/9.3 users that do not use tablespaces.

Scenario 2: Backup via rsync/SSH

The traditional setup of rsync over SSH is the only available option for:

  • PostgreSQL servers version 8.3, 8.4, 9.0 or 9.1
  • PostgreSQL servers version 9.2 or 9.3 that are using tablespaces
  • incremental backup and deduplication
  • network compression during backups
  • finer control of bandwidth usage, including on a tablespace basis

Figure: Backup via rsync/SSH (Scenario 2)

In this scenario, you will need to configure:

  1. a standard connection to PostgreSQL for management, coordination, and monitoring purposes
  2. an SSH connection for base backup operations to be used by rsync that allows the barman user on the Barman server to connect as postgres user on the PostgreSQL server
  3. an SSH connection for WAL archiving to be used by the archive_command in PostgreSQL and that allows the postgres user on the PostgreSQL server to connect as barman user on the Barman server

Starting from PostgreSQL 9.2, you can add a streaming replication connection that is used for WAL streaming and significantly reduce RPO. This more robust implementation is depicted in the figure below.

Figure: Backup via rsync/SSH with WAL streaming (Scenario 2b)
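The selection rules described for the two scenarios can be summarised in a few lines of Python. This is only a sketch of the decision process outlined in this section, not code taken from Barman: the function and its parameters are hypothetical, and the returned strings correspond to values of the backup_method option.

```python
def choose_backup_method(pg_version, uses_tablespaces=False,
                         needs_incremental=False,
                         needs_network_compression=False,
                         needs_fine_bandwidth_control=False):
    """Suggest a backup method for a server.

    `pg_version` is a (major, minor) tuple such as (9, 4).
    Returns "postgres" (streaming backup) or "rsync" (rsync/SSH).
    """
    if pg_version < (8, 3):
        raise ValueError("PostgreSQL older than 8.3 is not supported")

    # Features that are only available through rsync/SSH.
    if (needs_incremental or needs_network_compression
            or needs_fine_bandwidth_control):
        return "rsync"

    # Streaming backup is the recommended setup from 9.4 onwards.
    if pg_version >= (9, 4):
        return "postgres"

    # On 9.2 and 9.3, streaming backup works only without tablespaces.
    if pg_version >= (9, 2) and not uses_tablespaces:
        return "postgres"

    return "rsync"


if __name__ == "__main__":
    print(choose_backup_method((9, 5)))
    print(choose_backup_method((9, 3), uses_tablespaces=True))
```

Remember that this choice is made per server, so different servers in the same Barman installation can end up with different methods.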

System requirements

  • Linux/Unix
  • Python 2.6 or 2.7
  • Python modules:
    • argcomplete
    • argh >= 0.21.2 <= 0.26.2
    • argparse (Python 2.6 only)
    • psycopg2 >= 2.4.2
    • python-dateutil <> 2.0
    • setuptools
  • PostgreSQL >= 8.3
  • rsync >= 3.0.4 (optional for PostgreSQL >= 9.2)

Important: Users of RedHat Enterprise Linux, CentOS and Scientific Linux are required to install the Extra Packages for Enterprise Linux (EPEL) repository.

Note: Python 3 support is experimental. Report any bug through the ticketing system on GitHub or the mailing list.

Requirements for backup

The most critical requirement for a Barman server is the amount of disk space available. You are recommended to plan the required disk space based on the size of the cluster, number of WAL files generated per day, frequency of backups, and retention policies.
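As a starting point for this planning exercise, the following Python sketch combines the four factors into a rough upper bound. The formula is our own simplification, not something computed by Barman: it assumes full, uncompressed base backups of constant size, a constant daily WAL volume, and a retention window expressed in days. All names are illustrative.

```python
import math


def estimate_disk_bytes(cluster_bytes, wal_bytes_per_day,
                        days_between_backups, retention_days):
    """Rough upper bound of the disk space needed for one server.

    Ignores compression, deduplication and growth of the cluster.
    """
    if days_between_backups <= 0 or retention_days <= 0:
        raise ValueError("intervals must be positive")

    # Backups inside the retention window, plus one extra backup that
    # is still needed to reach the oldest point of the window.
    backups_kept = math.ceil(retention_days / days_between_backups) + 1

    # WAL files must be kept from the oldest retained backup onwards.
    wal_days = retention_days + days_between_backups

    return backups_kept * cluster_bytes + wal_days * wal_bytes_per_day


if __name__ == "__main__":
    gib = 1024 ** 3
    # 200 GiB cluster, 10 GiB of WAL per day, weekly backups, 4 weeks.
    print(estimate_disk_bytes(200 * gib, 10 * gib, 7, 28) / gib)
```

Treat the result as a ceiling to refine with real measurements, then add headroom for growth.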

Although the only file systems that we officially support are XFS and Ext4, we are aware of users that deploy Barman on different file systems including ZFS and NFS.

Requirements for recovery

Barman allows you to recover a PostgreSQL instance either locally (where Barman resides) or remotely (on a separate server).

Remote recovery is definitely the most common way to restore a PostgreSQL server with Barman.

Either way, the same requirements for PostgreSQL’s Log shipping and Point-In-Time-Recovery apply:

  • identical hardware architecture
  • identical major version of PostgreSQL

In general, it is highly recommended to create recovery environments that are as similar as possible, if not identical, to the original server, because they are easier to maintain. For example, we suggest that you use the same operating system, the same PostgreSQL version, the same disk layouts, and so on.

Additionally, dedicated recovery environments for each PostgreSQL server, even on demand, allow you to nurture the disaster recovery culture in your team. You can be prepared for when something unexpected happens by practising recovery operations and becoming familiar with them.

Based on our experience, designated recovery environments reduce the impact of stress in real failure situations, and therefore increase the effectiveness of recovery operations.

Finally, it is important that time is synchronised between the servers, using NTP for example.

Installation

Important: The recommended way to install Barman is by using the available packages for your GNU/Linux distribution.

Installation on RedHat/CentOS using RPM packages

Barman can be installed on RHEL7, RHEL6 and RHEL5 Linux systems using RPM packages. You must install the Extra Packages for Enterprise Linux (EPEL) repository beforehand.

RPM packages for Barman are available via Yum through the PostgreSQL Global Development Group RPM repository. You need to follow the instructions for your distribution (for example RedHat, CentOS, or Fedora) and architecture as detailed at yum.postgresql.org.

Then, as root simply type:

yum install barman

2ndQuadrant also maintains RPM packages for Barman and distributes them through Sourceforge.net.

Installation on Debian/Ubuntu using packages

Barman can be installed on Debian and Ubuntu Linux systems using packages.

It is directly available in the official repositories for Debian and Ubuntu. However, these repositories might not contain the latest available version. If you want to have the latest version of Barman, the recommended method is to install it through the PostgreSQL Community APT repository. Instructions can be found in the APT section of the PostgreSQL Wiki.

Note: Thanks to the direct involvement of Barman developers in the PostgreSQL Community APT repository project, you will have access to the most updated versions of Barman.

Installing Barman is just as easy. As root user, simply type:

apt-get install barman

Installation from sources

WARNING: Manual installation of Barman from sources should only be performed by expert GNU/Linux users. Installing Barman this way requires system administration activities such as dependencies management, barman user creation, configuration of the barman.conf file, cron setup for the barman cron command, log management, and so on.

Create a system user called barman on the backup server. As barman user, download the sources and uncompress them.

For a system-wide installation, type:

barman@backup$ ./setup.py build
# run this command with root privileges or through sudo
barman@backup# ./setup.py install

For a local installation, type:

barman@backup$ ./setup.py install --user

The barman application will be installed in your user directory (make sure that your PATH environment variable is set properly).

Barman is also available on the Python Package Index (PyPI) and can be installed through pip.

Configuration

There are two types of configuration files in Barman:

  • global/general configuration
  • server configuration

The main configuration file (set to /etc/barman.conf by default) contains general options such as main directory, system user, log file, and so on.

Server configuration files, one for each server to be backed up by Barman, are located in the /etc/barman.d directory and must have a .conf suffix.

IMPORTANT: For historical reasons, you can still have one single configuration file containing both global and server options. However, for maintenance reasons, this approach is deprecated.

Configuration files in Barman follow the INI format.

Options scope

Every configuration option has a scope:

  • global
  • server
  • global/server: server options that can be generally set at global level

Global options are allowed in the general section, which is identified in the INI file by the [barman] label:

[barman]
; ... global and global/server options go here

Server options can only be specified in a server section, which is identified by a line in the configuration file, in square brackets ([ and ]). The server section represents the ID of that server in Barman. The following example specifies a section for the server named pg:

[pg]
; Configuration options for the
; server named 'pg' go here

There are two reserved words that cannot be used as server names in Barman:

  • barman: identifier of the global section
  • all: a handy shortcut that allows you to execute some commands on every server managed by Barman in sequence

Barman implements the convention over configuration design paradigm, which attempts to reduce the number of options that you are required to configure without losing flexibility. Therefore, some server options can be defined at global level and overridden at server level, allowing users to specify a generic behaviour and refine it for one or more servers. These options have a global/server scope.

For a list of all the available configurations and their scope, please refer to section 5 of the man page.

man 5 barman
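The way global/server options are resolved can be illustrated with Python’s standard configparser module. This is a minimal sketch of the idea, not Barman’s own configuration parser; the server pg2 and its settings are hypothetical and exist only to show an override.

```python
import configparser

RESERVED_NAMES = {"barman", "all"}  # cannot be used as server names

SAMPLE = """
[barman]
; global and global/server options
compression = gzip
log_level = INFO

[pg]
conninfo = host=pg user=barman database=postgres

[pg2]
; hypothetical second server overriding a global/server option
conninfo = host=pg2 user=barman database=postgres
compression = bzip2
"""


def load(text):
    """Parse INI text; interpolation is disabled so '%' is kept as-is."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def server_names(parser):
    """Return every section that describes a server."""
    return [name for name in parser.sections() if name != "barman"]


def effective_option(parser, server, option):
    """Return the server-level value, falling back to the global one."""
    if server in RESERVED_NAMES:
        raise ValueError("%r is a reserved word" % server)
    if parser.has_option(server, option):
        return parser.get(server, option)
    return parser.get("barman", option, fallback=None)


if __name__ == "__main__":
    config = load(SAMPLE)
    for name in server_names(config):
        print(name, effective_option(config, name, "compression"))
```

In this sketch, pg inherits compression from the [barman] section, while pg2 refines it at server level.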

Examples of configuration

The following is a basic example of main configuration file:

[barman]
barman_user = barman
configuration_files_directory = /etc/barman.d
barman_home = /var/lib/barman
log_file = /var/log/barman/barman.log
log_level = INFO
compression = gzip

The example below, on the other hand, is a server configuration file that uses streaming backup:

[streaming-pg]
description =  "Example of PostgreSQL Database (Streaming-Only)"
conninfo = host=pg user=barman database=postgres
streaming_conninfo = host=pg user=streaming_barman
backup_method = postgres
streaming_archiver = on
slot_name = barman

The following code shows a basic example of traditional backup using rsync/SSH:

[ssh-pg]
description =  "Example of PostgreSQL Database (via Ssh)"
ssh_command = ssh postgres@pg
conninfo = host=pg user=barman database=postgres
backup_method = rsync
reuse_backup = link
archiver = on

For more detailed information, please refer to the distributed barman.conf file, as well as the ssh-server.conf-template and streaming-server.conf-template template files.

Setup of a new server in Barman

As mentioned in the “Design and architecture” section, we will use the following conventions:

  • pg as server ID and host name where PostgreSQL is installed
  • backup as host name where Barman is located

Preliminary steps

This section contains some preliminary steps that you need to undertake before setting up your PostgreSQL server in Barman.

IMPORTANT: Before you proceed, make sure you have decided on your WAL archiving and backup strategies, as outlined in the “Design and architecture” section.

PostgreSQL connection

You need to make sure that the backup server can connect to the PostgreSQL server on pg as superuser. This operation is mandatory.

We recommend creating a dedicated user in PostgreSQL, named barman, as follows:

postgres@pg$ createuser -s -W barman

IMPORTANT: The above command will prompt for a password, which you are then advised to add to the ~barman/.pgpass file on the backup server. For further information, please refer to “The Password File” section in the PostgreSQL Documentation.

This connection is required by Barman in order to coordinate its activities with the server, as well as for monitoring purposes.

You can choose your favourite client authentication method among those offered by PostgreSQL. More information can be found in the “Client Authentication” section of the PostgreSQL Documentation.

Make sure you test the following command before proceeding:

barman@backup$ psql -c 'SELECT version()' -U barman -h pg postgres

NOTE: As of version 1.1.2, Barman honours the application_name connection option for PostgreSQL servers 9.0 or higher.

Write down the above information and keep it for later. You will need it in the conninfo option for your server configuration, like in this example:

[pg]
; ...
conninfo = host=pg user=barman database=postgres

PostgreSQL streaming connection

In case you plan to use WAL streaming or streaming backup, you need to set up a streaming connection. We recommend creating a dedicated user in PostgreSQL, named streaming_barman, as follows:

postgres@pg$ createuser -S -W --replication streaming_barman

IMPORTANT: The above command will prompt for a password, which you are then advised to add to the ~barman/.pgpass file on the backup server. For further information, please refer to “The Password File” section in the PostgreSQL Documentation.
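Entries in the password file follow the hostname:port:database:username:password format, where a literal colon or backslash must be escaped with a backslash, and streaming replication connections use replication in the database field. The Python sketch below builds such lines for the two users. The helper name, the port 5432 and the passwords are placeholders chosen for this example.

```python
def pgpass_line(host, port, database, user, password):
    """Build one line for the ~/.pgpass file.

    Colons and backslashes inside a field are escaped with a backslash.
    """
    def escape(value):
        return str(value).replace("\\", "\\\\").replace(":", "\\:")

    fields = (host, port, database, user, password)
    return ":".join(escape(field) for field in fields)


if __name__ == "__main__":
    # Placeholder passwords: replace them with the ones you chose.
    print(pgpass_line("pg", 5432, "postgres", "barman", "CHANGE_ME"))
    print(pgpass_line("pg", 5432, "replication", "streaming_barman",
                      "CHANGE_ME_TOO"))
```

Remember that the password file is ignored unless its permissions prevent access by other users (for example, mode 0600).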

Before you proceed, you need to properly configure PostgreSQL on pg to accept streaming replication connections from the Barman server. Please read the following sections in the PostgreSQL documentation:

You can manually verify that the streaming connection works through the following command:

barman@backup$ psql -U streaming_barman -h pg \
  -c "IDENTIFY_SYSTEM" \
  replication=1

Please make sure you are able to connect via streaming replication before going any further.

Write down the above information and keep it for later. You will need it in the streaming_conninfo option for your server configuration, like in this example:

[pg]
; ...
streaming_conninfo = host=pg user=streaming_barman

SSH connections

SSH key exchange is a very common practice that is used to implement secure passwordless connections between users on different machines.

PostgreSQL user’s key

Unless you have done it before, you need to create an SSH key for the PostgreSQL user. Log in to the pg host as postgres user and type:

postgres@pg$ ssh-keygen -t rsa

Barman user’s key

Unless you have done it before, you need to create an SSH key for the Barman user. Log in to the backup host as barman user and type:

barman@backup$ ssh-keygen -t rsa

From PostgreSQL to Barman

TODO:

  • Explain it is needed for WAL archiving
  • Explain the steps
  • Manual verification

From Barman to PostgreSQL

TODO:

  • Explain it is needed for traditional rsync backup
  • Explain the steps
  • Manual verification

The server configuration file

Create a new file, called pg.conf, in /etc/barman.d directory, with the following content:

[pg]
active = false
description =  "Our main PostgreSQL server"
conninfo = host=pg user=barman database=postgres

The active = false line temporarily disables this server during maintenance operations triggered by the barman cron command, allowing you to continue with the configuration.

The conninfo option is set according to the section “Preliminary steps: PostgreSQL connection”.

WAL streaming

Barman can reduce the Recovery Point Objective (RPO) by allowing users to add, on top of the standard archive_command strategy, continuous WAL streaming from a PostgreSQL server.

Barman relies on pg_receivexlog, a utility that is available from PostgreSQL 9.2 which exploits the native streaming replication protocol and continuously receives transaction logs from a PostgreSQL server (be it a master or a standby).

Important: Barman requires that pg_receivexlog is installed on the same server as Barman. For PostgreSQL 9.2 servers, you need pg_receivexlog of version 9.2 installed alongside Barman. For PostgreSQL 9.3 and above, it is recommended to install the latest available version of pg_receivexlog, as it is backward compatible. Otherwise, users can install multiple versions of pg_receivexlog on the Barman server and point to the specific version for a server, using the path_prefix option in the configuration file.

In order to enable streaming of transaction logs, you need to:

  1. set up a streaming connection, as previously described;
  2. set the streaming_archiver option to on.

The cron command, if the aforementioned requirements are met, transparently manages log streaming through the execution of the receive-wal command. This is the recommended scenario.

However, users can manually execute the receive-wal command:

barman receive-wal <server_name>

Note: The receive-wal command is a foreground process.

Transaction logs are streamed directly in the directory specified by the streaming_wals_directory configuration option and are then archived by the archive-wal command.

Unless otherwise specified in the streaming_archiver_name parameter, and only for PostgreSQL 9.3 or above, Barman will set application_name of the WAL streamer process to barman_receive_wal, allowing you to monitor its status in the pg_stat_replication system view of the PostgreSQL server.

Replication slots

TODO:

  • Explain how to configure replication slots, how to create them, etc.
  • Mention streaming-only scenarios

WAL archiving via archive_command

In case you want to set up the traditional WAL file archiving process, Barman requires that PostgreSQL’s archive_command is properly configured on the master.

Important: PostgreSQL 9.5 introduces support for WAL file archiving using archive_command from a standby. This feature is not yet implemented in Barman.

Edit the postgresql.conf file of the PostgreSQL instance on the pg host and activate the archive mode:

archive_mode = on
wal_level = 'replica'
archive_command = 'rsync -a %p barman@backup:INCOMING_WALS_DIRECTORY/%f'

Make sure you replace the INCOMING_WALS_DIRECTORY placeholder with the value of incoming_wals_directory returned by the barman show-server pg command.
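The substitution of the placeholder can be expressed as a tiny Python helper, which may be handy if you generate postgresql.conf from a provisioning script. The function is hypothetical, and the directory used in the example is only illustrative: always use the value reported by barman show-server for your server.

```python
def build_archive_command(incoming_wals_directory,
                          user="barman", host="backup"):
    """Return an rsync-based archive_command shipping WALs to Barman.

    %p and %f are expanded by PostgreSQL, not by this function.
    """
    directory = incoming_wals_directory.rstrip("/")
    return "rsync -a %p {0}@{1}:{2}/%f".format(user, host, directory)


if __name__ == "__main__":
    # Illustrative path only; take the real one from 'barman show-server pg'.
    command = build_archive_command("/var/lib/barman/pg/incoming")
    print("archive_command = '{0}'".format(command))
```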

For PostgreSQL versions older than 9.6, wal_level must be set to hot_standby, as the replica value was introduced in 9.6.

Restart the PostgreSQL server.

In order to test that continuous archiving is on and properly working, you need to check both the PostgreSQL server and the backup server (in particular, that WAL files are correctly collected in the destination directory).

In order to improve the verification of the WAL archiving process, the switch-xlog command has been developed:

barman@backup$ barman switch-xlog --force pg

Streaming backup

TODO:

  • Cover backup_method = postgres

Backup with rsync/SSH

TODO:

  • This is the only available method before 2.0, copy instructions from the old tutorial and adapt

General commands

TODO:

  • Define general commands
  • do an updated inventory of general commands
  • order commands alphabetically
  • Remove all ‘From version …’ paragraph. Assume that we start from scratch with 2.0

cron

You can perform maintenance operations, on both WAL files and backups, using the command:

barman cron

As of version 1.5.1 barman cron executes WAL archiving operations concurrently on a server basis.

This also enforces retention policies on those servers that have:

  • retention_policy not empty and valid;
  • retention_policy_mode set to auto.

Note: This command should be executed in a cron script. Our recommendation is to schedule barman cron to run every minute.

diagnose

TODO

list-server

You can display the list of active servers that have been configured for your backup system with:

barman list-server

Server commands

TODO:

  • Define server commands
  • do an updated inventory of server commands
  • order commands alphabetically
  • Remove all ‘From version …’ paragraph. Assume that we start from scratch with 2.0

backup

You can perform a full backup (base backup) for a given server with:

barman backup [--immediate-checkpoint] <server_name>

Tip: You can use barman backup all to sequentially backup all your configured servers.

check

You can check if the connection to a given server is properly working with:

barman check <server_name>

Tip: You can use barman check all to check all your configured servers.

From version 1.3.3, you can automatically be notified if the latest backup of a given server is older than, for example, 7 days.

Barman introduces the option named last_backup_maximum_age having the following syntax:

last_backup_maximum_age = {value {DAYS | WEEKS | MONTHS}}

where value is a positive integer representing the number of days, weeks or months of the time frame.
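To make the syntax concrete, the following Python sketch parses a value such as 7 DAYS and checks whether a backup is older than the resulting interval. It is not Barman’s implementation: the function names are ours, and the conversion of one month to 31 days is an assumption made for this example.

```python
import re
from datetime import datetime, timedelta

# Days per unit; treating a month as 31 days is an assumption.
_DAYS_PER_UNIT = {"DAYS": 1, "WEEKS": 7, "MONTHS": 31}
_PATTERN = re.compile(r"^\s*(\d+)\s+(DAYS|WEEKS|MONTHS)\s*$", re.IGNORECASE)


def parse_maximum_age(text):
    """Convert '<value> DAYS|WEEKS|MONTHS' into a timedelta."""
    match = _PATTERN.match(text)
    if not match or int(match.group(1)) == 0:
        raise ValueError("expected '<positive integer> DAYS|WEEKS|MONTHS'")
    value = int(match.group(1))
    unit = match.group(2).upper()
    return timedelta(days=value * _DAYS_PER_UNIT[unit])


def backup_is_too_old(backup_end, now, maximum_age):
    """Return True when the latest backup exceeds the allowed age."""
    return now - backup_end > maximum_age


if __name__ == "__main__":
    limit = parse_maximum_age("7 DAYS")
    print(backup_is_too_old(datetime(2016, 1, 1), datetime(2016, 1, 9), limit))
```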

get-wal

From version 1.5.0, Barman allows users to request any xlog file from its WAL archive through the get-wal command:

barman get-wal [-o OUTPUT_DIRECTORY] [-j|-x] <server_name> <wal_id>

If the requested WAL file is found in the server archive, the uncompressed content will be returned to STDOUT, unless otherwise specified.

The following options are available for the get-wal command:

  • -o allows users to specify a destination directory where Barman will deposit the requested WAL file
  • -j will compress the output using bzip2 algorithm
  • -x will compress the output using gzip algorithm
  • -p SIZE peeks from the archive up to SIZE WAL files, starting from the requested one.

It is possible to use get-wal during a recovery operation, transforming the Barman server into a WAL hub for your servers. This can be automatically achieved by adding the get-wal value to the recovery_options global/server configuration option:

recovery_options = 'get-wal'

TODO: Rewrite this with barman-wal-restore

recovery_options is a global/server option that accepts a list of comma separated values. If the keyword get-wal is present, during a recovery operation Barman will prepare the recovery.conf file by setting the restore_command so that barman get-wal is used to fetch the required WAL files.

This is an example of a restore_command for a remote recovery:

restore_command = 'ssh barman@pgbackup barman get-wal SERVER %f > %p'

This is an example of a restore_command for a local recovery:

restore_command = 'barman get-wal SERVER %f > %p'

Important: Even though recovery_options aims to automate the process, using the get-wal facility requires manual intervention and proper testing.
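
The two restore_command examples above differ only in the optional SSH prefix. As an illustration, the following Python helper (hypothetical, not part of Barman) composes either form from a server name and an optional SSH target:

```python
import shlex


def build_restore_command(server_name, ssh_target=None):
    """Compose a restore_command that fetches WAL files with barman get-wal.

    ssh_target is something like 'barman@pgbackup' for a remote recovery,
    or None for a local recovery. %f and %p are expanded by PostgreSQL.
    """
    get_wal = "barman get-wal {} %f > %p".format(shlex.quote(server_name))
    if ssh_target is None:
        return get_wal
    return "ssh {} {}".format(shlex.quote(ssh_target), get_wal)
```

The generated string still has to be reviewed and tested manually, as noted above.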

list-backup

You can list the catalogue of available backups for a given server with:

barman list-backup <server_name>

rebuild-xlogdb

At any time, you can regenerate the metadata of the WAL archive for a specific server (or every server, using the all shortcut). This metadata is stored in the xlog.db file, and every server managed by Barman has its own copy. From version 1.2.4, you can rebuild the xlog.db file with the rebuild-xlogdb command, which scans all the archived WAL files and regenerates the metadata for the archive.

Important: Users of Barman < 1.2.3 might have suffered from a bug due to bad locking in highly concurrent environments. You can now regenerate the WAL archive using the rebuild-xlogdb command.

barman rebuild-xlogdb <server_name>

show-server

You can show the configuration parameters for a given server with:

barman show-server <server_name>

Backup commands

TODO:

  • Define backup commands
  • do an updated inventory of server commands
  • order commands alphabetically
  • Remove all ‘From version …’ paragraph. Assume that we start from scratch with 2.0

Note: A backup ID can be retrieved with barman list-backup <server_name>

TODO: Shortcuts

delete

You can delete a given backup with:

barman delete <server_name> <backup_id>

From version 1.1.2, in order to delete the oldest backup, you can issue:

barman delete <server_name> oldest

list-files

You can list the files (base backup and required WAL files) for a given backup with:

barman list-files [--target TARGET_TYPE] <server_name> <backup_id>

With the --target TARGET_TYPE option, it is possible to choose the content of the list for a given backup.

Possible values for TARGET_TYPE are:

  • data: lists just the data files;
  • standalone: lists the base backup files, including required WAL files;
  • wal: lists all WAL files from the beginning of the base backup to the start of the following one (or until the end of the log);
  • full: same as data + wal.

The default value for TARGET_TYPE is standalone.

Important: The list-files command facilitates interaction with external tools, and therefore can be extremely useful to integrate Barman into your archiving procedures.

recover

TODO

show-backup

You can show all the available information for a particular backup of a given server with:

barman show-backup <server_name> <backup_id>

From version 1.1.2, in order to show the latest backup, you can issue:

barman show-backup <server_name> latest

Features in detail

Incremental backup

From version 1.4.0, Barman implements file-level incremental backup. Incremental backup is a kind of full periodic backup which saves only data changes from the latest full backup available in the catalogue for a specific PostgreSQL server. It must not be confused with differential backup, which is implemented by WAL continuous archiving.

The main goals of incremental backup in Barman are:

  • Reduce the time taken for the full backup process
  • Reduce the disk space occupied by several periodic backups (data deduplication)

This feature relies heavily on rsync and hard links, which must therefore be supported by both the underlying operating system and the file system where the backup data resides.

The main concept is that a subsequent base backup will share those files that have not changed since the previous backup, leading to significant savings in disk usage. This is particularly true of VLDB contexts and, more generally, of those databases containing a high percentage of read-only historical tables.

Barman implements incremental backup through a global/server option, called reuse_backup, that transparently manages the barman backup command. It accepts three values:

  • off: standard full backup (default)
  • link: incremental backup, by reusing the last backup for a server and creating a hard link of the unchanged files (for backup space and time reduction)
  • copy: incremental backup, by reusing the last backup for a server and creating a copy of the unchanged files (just for backup time reduction)

The most common scenario is to set reuse_backup to link, as follows:

reuse_backup = link

Setting this at global level will automatically enable incremental backup for all your servers.

As a final note, users can override the setting of the reuse_backup option through the --reuse-backup runtime option for the barman backup command. Similarly, the runtime option accepts three values: off, link and copy. For example, you can run a one-off incremental backup as follows:

barman backup --reuse-backup=link <server_name>
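
To make the difference between the three reuse_backup values more concrete, here is a minimal Python sketch of the idea. It is only an illustration: Barman delegates this work to rsync, whereas the sketch walks the directory tree itself and decides that a file is unchanged by comparing its content with the previous backup. The function name and the directory arguments are hypothetical.

```python
import filecmp
import os
import shutil


def reuse_backup(previous_dir, source_dir, new_dir, mode="link"):
    """Build new_dir from source_dir, reusing unchanged files of previous_dir.

    mode is 'link' (hard link unchanged files), 'copy' (copy them from the
    previous backup) or 'off' (always copy from the source).
    Returns the number of files reused from the previous backup.
    """
    reused = 0
    for root, _dirs, files in os.walk(source_dir):
        rel_root = os.path.relpath(root, source_dir)
        dst_root = os.path.normpath(os.path.join(new_dir, rel_root))
        os.makedirs(dst_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            old = os.path.normpath(os.path.join(previous_dir, rel_root, name))
            dst = os.path.join(dst_root, name)
            # Sketch assumption: "unchanged" means identical content.
            unchanged = (
                mode != "off"
                and os.path.isfile(old)
                and filecmp.cmp(src, old, shallow=False)
            )
            if unchanged and mode == "link":
                os.link(old, dst)  # no extra disk space is used
                reused += 1
            elif unchanged:
                shutil.copy2(old, dst)  # saves transfer time, not space
                reused += 1
            else:
                shutil.copy2(src, dst)
    return reused
```

In link mode, an unchanged file in the new backup is the same file on disk as in the previous backup, which is why both backups must reside on the same file system.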

WAL compression

The barman cron command will compress WAL files if the compression option is set in the configuration file. This option allows six values:

  • bzip2: for Bzip2 compression (requires the bzip2 utility)
  • gzip: for Gzip compression (requires the gzip utility)
  • pybzip2: for Bzip2 compression (uses Python’s internal compression module)
  • pygzip: for Gzip compression (uses Python’s internal compression module)
  • pigz: for Pigz compression (requires the pigz utility)
  • custom: for custom compression, which requires you to set the following options as well:
    • custom_compression_filter: a compression filter
    • custom_decompression_filter: a decompression filter

NOTE: The pybzip2, pygzip and pigz options for standard compression have been introduced in Barman 1.6.0. All methods but pybzip2 and pygzip require barman archive-wal to fork a new process.

Limiting bandwidth usage

From version 1.2.1, it is possible to limit the usage of I/O bandwidth through the bandwidth_limit option (global/per server), by specifying the maximum number of kilobytes per second. By default it is set to 0, meaning no limit.

In case you have several tablespaces and you prefer to limit the I/O workload of your backup procedures on one or more tablespaces, you can use the tablespace_bandwidth_limit option (global/per server):

tablespace_bandwidth_limit = tbname:bwlimit[, tbname:bwlimit, ...]

The option accepts a comma separated list of pairs made up of the tablespace name and the bandwidth limit (in kilobytes per second).

When backing up a server, Barman will try and locate any existing tablespace in the above option. If found, the specified bandwidth limit will be enforced. If not, the default bandwidth limit for that server will be applied.
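
The lookup described above can be sketched in Python as follows. The helper names and the tablespace names used in the usage note are hypothetical; this is an illustration of the rule, not Barman's implementation.

```python
def parse_tablespace_bandwidth_limit(text):
    """Parse 'tbname:bwlimit[, tbname:bwlimit, ...]' into a dictionary."""
    limits = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue
        name, _, limit = pair.rpartition(":")
        if not name.strip():
            raise ValueError("expected tbname:bwlimit, got %r" % pair)
        limits[name.strip()] = int(limit)  # kilobytes per second
    return limits


def bandwidth_for(tablespace, tablespace_limits, default_limit=0):
    """Return the limit for a tablespace, or the server default if not listed."""
    return tablespace_limits.get(tablespace, default_limit)
```

For example, with the value "tbs_fast:5000, tbs_archive:1000" and a server-wide bandwidth_limit of 3000, a tablespace called tbs_archive would be limited to 1000 kilobytes per second, while any tablespace not listed would fall back to 3000.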

Network Compression

From version 1.3.0 it is possible to reduce the size of transferred data using compression. It can be enabled using the network_compression option (global/per server):

network_compression = true|false

Setting this option to true will enable data compression during network transfers (for both backup and recovery). By default it is set to false.

Backup ID shortcuts

As of version 1.1.2, you can use any of the following shortcuts to identify a particular backup for a given server:

  • latest: the latest available backup for that server, in chronological order. You can also use the last synonym.
  • oldest: the oldest available backup for that server, in chronological order. You can also use the first synonym.

These aliases can be used with any of the following commands: show-backup, delete, list-files and recover.
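
As an illustration of how these shortcuts map to actual backups, consider the following Python sketch. The function is hypothetical and assumes that the catalogue is given as a list of backup IDs in chronological order, oldest first.

```python
# Position of the backup each shortcut refers to, in a list ordered oldest first.
_SHORTCUTS = {"latest": -1, "last": -1, "oldest": 0, "first": 0}


def resolve_backup_id(backup_ids, requested):
    """Translate a shortcut into a backup ID; plain IDs are returned unchanged."""
    if requested in _SHORTCUTS:
        if not backup_ids:
            raise LookupError("no backup available for this server")
        return backup_ids[_SHORTCUTS[requested]]
    if requested not in backup_ids:
        raise LookupError("unknown backup ID: %s" % requested)
    return requested
```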

Minimum redundancy safety

From version 1.2.0, you can define the minimum number of periodic backups for a PostgreSQL server.

You can use the global/per server configuration option called minimum_redundancy for this purpose, by default set to 0.

By setting this value to any number greater than 0, Barman makes sure that at any time you will have at least that number of backups in a server catalogue.

This will protect you from accidental barman delete operations.

Important: Make sure that your retention policy settings do not collide with minimum redundancy requirements. Regularly check Barman’s log for messages on this topic.

Retention policies

From version 1.2.0, Barman supports retention policies for backups.

A backup retention policy is a user-defined policy that determines how long backups and related archive logs (Write Ahead Log segments) need to be retained for recovery procedures.

Based on the user’s request, Barman retains the periodic backups required to satisfy the current retention policy, and any archived WAL files required for the complete recovery of those backups.

Barman users can define a retention policy in terms of backup redundancy (how many periodic backups) or a recovery window (how long).

Retention policy based on redundancy

In a redundancy based retention policy, the user determines how many periodic backups to keep. A redundancy-based retention policy is contrasted with retention policies that use a recovery window.

Retention policy based on recovery window

A recovery window is one type of Barman backup retention policy, in which the DBA specifies a period of time and Barman ensures retention of backups and/or archived WAL files required for point-in-time recovery to any time during the recovery window. The interval always ends with the current time and extends back in time for the number of days specified by the user. For example, if the retention policy is set for a recovery window of seven days, and the current time is 9:30 AM on Friday, Barman retains the backups required to allow point-in-time recovery back to 9:30 AM on the previous Friday.

Scope

Retention policies can be defined for:

  • PostgreSQL periodic base backups: through the retention_policy configuration option;
  • Archive logs, for Point-In-Time-Recovery: through the wal_retention_policy configuration option.

Important: In a temporal dimension, archive logs must be included in the time window of periodic backups.

There are two typical use cases here: full or partial point-in-time recovery.

Full point in time recovery scenario

Base backups and archive logs share the same retention policy, allowing DBAs to recover at any point in time from the first available backup.

Partial point in time recovery scenario

The base backup retention policy is wider than that of archive logs, allowing users, for example, to keep full weekly backups of the last 6 months, but archive logs for the last 4 weeks only (making it possible to recover to any point in time starting from the last 4 periodic weekly backups).

Important: Currently, Barman implements only the full point in time recovery scenario, by constraining the wal_retention_policy option to main.

How they work

Retention policies in Barman can be:

  • automated: enforced by barman cron;
  • manual: Barman simply reports obsolete backups and allows DBAs to delete them.

Important: Currently Barman does not implement manual enforcement. This feature will be available in future versions.

Configuration and syntax

Retention policies can be defined through the following configuration options:

  • retention_policy: for base backup retention;
  • wal_retention_policy: for archive logs retention;
  • retention_policy_mode: can only be set to auto (retention policies are automatically enforced by the barman cron command).

These configuration options can be defined at both a global level and a server level, allowing users maximum flexibility in a multi-server environment.

Syntax for retention_policy

The general syntax for a base backup retention policy through retention_policy is the following:

retention_policy = {REDUNDANCY value | RECOVERY WINDOW OF value {DAYS | WEEKS | MONTHS}}

Where:

  • syntax is case insensitive;
  • value is an integer and is > 0;
  • in case of redundancy retention policy:
    • value must be greater than or equal to the server minimum redundancy level (if not, it is assigned to that value and a warning is generated);
    • the first valid backup is the value-th backup in a reverse ordered time series;
  • in case of recovery window policy:
    • the point of recoverability is: current time - window;
    • the first valid backup is the first available backup before the point of recoverability; its value in a reverse ordered time series must be greater than or equal to the server minimum redundancy level (if not, it is assigned to that value and a warning is generated).

By default, retention_policy is empty (no retention enforced).
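
The rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not Barman's implementation: the function names are hypothetical, a month is approximated as 30 days, and each backup is represented only by its end time.

```python
import re
from datetime import datetime, timedelta

# Assumption for this sketch only: a month is approximated as 30 days.
_WINDOW_UNIT_DAYS = {"DAYS": 1, "WEEKS": 7, "MONTHS": 30}


def parse_retention_policy(text):
    """Return ('redundancy', n) or ('window', timedelta); case insensitive."""
    match = re.fullmatch(r"\s*REDUNDANCY\s+(\d+)\s*", text, re.IGNORECASE)
    if match and int(match.group(1)) > 0:
        return ("redundancy", int(match.group(1)))
    match = re.fullmatch(
        r"\s*RECOVERY\s+WINDOW\s+OF\s+(\d+)\s+(DAYS|WEEKS|MONTHS)\s*",
        text,
        re.IGNORECASE,
    )
    if match and int(match.group(1)) > 0:
        days = int(match.group(1)) * _WINDOW_UNIT_DAYS[match.group(2).upper()]
        return ("window", timedelta(days=days))
    raise ValueError("invalid retention policy: %r" % text)


def obsolete_backups(end_times, policy, now, minimum_redundancy=0):
    """Return the end times of backups no longer required, oldest first."""
    newest_first = sorted(end_times, reverse=True)
    kind, value = policy
    if kind == "redundancy":
        # The first valid backup is the value-th one, counting from the newest.
        keep = max(value, minimum_redundancy)
    else:
        point_of_recoverability = now - value
        newer = [t for t in newest_first if t > point_of_recoverability]
        # Also keep the first backup at or before the point of recoverability.
        keep = max(len(newer) + 1, minimum_redundancy)
    return sorted(newest_first[keep:])
```

With a recovery window of seven days and backups taken 15, 10, 5 and 1 days before the current time, only the 15-day-old backup is obsolete: the 10-day-old backup is still needed in order to reach the point of recoverability.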

Syntax for wal_retention_policy

Currently, the only allowed value for wal_retention_policy is the special value main, that maps the retention policy of archive logs to that of base backups.

Concurrent Backup and backup from a standby

Normally, during backup operations, Barman uses PostgreSQL native functions pg_start_backup and pg_stop_backup for exclusive backup. These operations are not allowed on a read-only standby server.

As of version 1.3.1, Barman is also capable of performing backups of PostgreSQL 9.2 or greater database servers in a concurrent way, primarily through the backup_options configuration parameter [7].

This introduces a new architecture scenario with Barman: backup from a standby server, using rsync.

Important: Concurrent backup requires users of PostgreSQL 9.2, 9.3, 9.4, and 9.5 to install the pgespresso open source extension on every PostgreSQL server of the cluster. For more detailed information and the source code, please visit the pgespresso extension website. As of version 2.0, Barman adds support for the new API introduced in PostgreSQL 9.6. This removes the requirement of the pgespresso extension to perform concurrent backups altogether.

By default, backup_options is transparently set to exclusive_backup (the only supported method by any Barman version prior to 1.3.1).

When backup_options is set to concurrent_backup, Barman activates the concurrent backup mode for a server and follows these two simple rules:

  • ssh_command must point to the destination Postgres server;
  • conninfo must point to a database on the destination Postgres database. Using PostgreSQL 9.2, 9.3, 9.4, and 9.5 pgespresso must be correctly installed through CREATE EXTENSION. Using 9.6 or greater, concurrent backups are executed through the Postgres native API.

The destination Postgres server can be either the master or a streaming replicated standby server.

Note: When backing up from a standby server, continuous archiving of WAL files must be configured on the master to ship files to the Barman server (as outlined in the “Continuous WAL archiving” section above) [8].

Hook scripts

Barman allows a database administrator to run hook scripts on these two events:

  • before and after a backup
  • before and after a WAL file is archived

There are two types of hook scripts that Barman can manage:

  • standard hook scripts (already present in Barman since version 1.1.0)
  • retry hook scripts, introduced in version 1.5.0

The only difference between these two types of hook scripts is that Barman executes a standard hook script only once, without checking its return code, whereas a retry hook script may be executed more than once depending on its return code.

Precisely, when executing a retry hook script, Barman checks the return code and retries indefinitely until the script returns either SUCCESS (with standard return code 0), or ABORT_CONTINUE (return code 62), or ABORT_STOP (return code 63). Barman treats any other return code as a transient failure to be retried. Users are given more power: a hook script can control its workflow by specifying whether a failure is transient. Also, in case of a ‘pre’ hook script, by returning ABORT_STOP, users can request Barman to interrupt the main operation with a failure.

Hook scripts are executed in the following order:

  1. The standard ‘pre’ hook script (if present)
  2. The retry ‘pre’ hook script (if present)
  3. The actual event (i.e. backup operation, or WAL archiving), if retry ‘pre’ hook script was not aborted with ABORT_STOP
  4. The retry ‘post’ hook script (if present)
  5. The standard ‘post’ hook script (if present)

The output generated by any hook script is written in the log file of Barman.

Note: Currently, ABORT_STOP is ignored by retry ‘post’ hook scripts. In these cases, apart from logging an additional warning, ABORT_STOP will behave like ABORT_CONTINUE.
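
The retry behaviour described above can be summarised with a short Python sketch. It is a simplified model rather than Barman's code: the function name is hypothetical, and the pause between two attempts is an assumption of the sketch.

```python
import time

SUCCESS = 0
ABORT_CONTINUE = 62
ABORT_STOP = 63
_FINAL_CODES = (SUCCESS, ABORT_CONTINUE, ABORT_STOP)


def run_retry_hook(run_script, wait_seconds=1.0, sleep=time.sleep):
    """Call run_script() until it returns a final code, then return that code.

    run_script is any callable returning the exit code of one execution.
    The wait between attempts is an assumption made for this sketch.
    """
    while True:
        code = run_script()
        if code in _FINAL_CODES:
            return code
        # Any other return code is treated as a transient failure.
        sleep(wait_seconds)
```

For a retry ‘pre’ hook script, the main operation goes ahead unless the value returned by this loop is ABORT_STOP.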

Backup scripts

Version 1.1.0 introduced backup scripts.

These scripts can be configured with the following global configuration options (which can be overridden on a per server basis):

  • pre_backup_script: hook script executed before a base backup, only once, with no check on the exit code
  • pre_backup_retry_script: retry hook script executed before a base backup, repeatedly until success or abort
  • post_backup_retry_script: retry hook script executed after a base backup, repeatedly until success or abort
  • post_backup_script: hook script executed after a base backup, only once, with no check on the exit code

The script definition is passed to a shell and can return any exit code. Only in case of a retry script, Barman checks the return code (see the section above).

The shell environment will contain the following variables:

  • BARMAN_BACKUP_DIR: backup destination directory
  • BARMAN_BACKUP_ID: ID of the backup
  • BARMAN_CONFIGURATION: configuration file used by barman
  • BARMAN_ERROR: error message, if any (only for the post phase)
  • BARMAN_PHASE: phase of the script, either pre or post
  • BARMAN_PREVIOUS_ID: ID of the previous backup (if present)
  • BARMAN_RETRY: 1 if it is a retry script (from 1.5.0), 0 if not
  • BARMAN_SERVER: name of the server
  • BARMAN_STATUS: status of the backup
  • BARMAN_VERSION: version of Barman (from 1.2.1)
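
As an example of how these variables can be used, the following Python sketch builds a one-line summary that a backup hook script could print, so that it ends up in Barman’s log file. The function name and the format of the message are hypothetical.

```python
import os


def backup_summary(env):
    """Build a one-line description of the backup from the hook environment."""
    parts = [
        "phase=" + env.get("BARMAN_PHASE", "unknown"),
        "retry=" + env.get("BARMAN_RETRY", "0"),
        "server=" + env.get("BARMAN_SERVER", "unknown"),
        "backup=" + env.get("BARMAN_BACKUP_ID", "unknown"),
        "status=" + env.get("BARMAN_STATUS", "unknown"),
    ]
    # BARMAN_ERROR is only meaningful in the post phase, and may be empty.
    if env.get("BARMAN_ERROR"):
        parts.append("error=" + env["BARMAN_ERROR"])
    return "backup hook: " + " ".join(parts)


# In a real hook script, the last line would be:
#     print(backup_summary(os.environ))
```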

WAL archive scripts

Version 1.3.0 introduced WAL archive hook scripts.

Similarly to backup scripts, archive scripts can be configured with global configuration options (which can be overridden on a per server basis):

  • pre_archive_script: hook script executed before a WAL file is archived by maintenance (usually barman cron), only once, with no check on the exit code
  • pre_archive_retry_script: retry hook script executed before a WAL file is archived by maintenance (usually barman cron), repeatedly until success or abort
  • post_archive_retry_script: retry hook script executed after a WAL file is archived by maintenance, repeatedly until success or abort
  • post_archive_script: hook script executed after a WAL file is archived by maintenance, only once, with no check on the exit code

The script is executed through a shell and can return any exit code. Only in case of a retry script, Barman checks the return code (see the section above).

Archive scripts share with backup scripts some environmental variables:

  • BARMAN_CONFIGURATION: configuration file used by barman
  • BARMAN_ERROR: error message, if any (only for the post phase)
  • BARMAN_PHASE: phase of the script, either pre or post
  • BARMAN_SERVER: name of the server

The following variables are specific to archive scripts:

  • BARMAN_SEGMENT: name of the WAL file
  • BARMAN_FILE: full path of the WAL file
  • BARMAN_SIZE: size of the WAL file
  • BARMAN_TIMESTAMP: WAL file timestamp
  • BARMAN_COMPRESSION: type of compression used for the WAL file
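
A possible use of a retry ‘post’ archive script is to copy every archived WAL file to a second location. The Python sketch below shows the core of such a script under stated assumptions: the function name and the destination directory are hypothetical, and the choice of return codes (1 for a transient failure, ABORT_CONTINUE when there is nothing to do) is only one reasonable option.

```python
import os
import shutil

SUCCESS = 0
ABORT_CONTINUE = 62
TRANSIENT_FAILURE = 1  # any code other than 0, 62 and 63 is retried by Barman


def copy_archived_wal(env, destination_dir):
    """Copy the WAL file named in BARMAN_FILE and return a hook exit code."""
    wal_path = env.get("BARMAN_FILE")
    if env.get("BARMAN_PHASE") != "post" or not wal_path:
        # Nothing to copy: give up without asking for a retry.
        return ABORT_CONTINUE
    target = os.path.join(destination_dir, os.path.basename(wal_path))
    try:
        shutil.copy2(wal_path, target)
    except OSError:
        # For example, the destination is temporarily unavailable.
        return TRANSIENT_FAILURE
    return SUCCESS


# In a real hook script, the last line would be something like:
#     sys.exit(copy_archived_wal(os.environ, "/path/to/second/archive"))
```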

Customisation of lock file directory

Since version 1.5.0, Barman allows DBAs to specify a directory for lock files through the barman_lock_directory global option.

Lock files are used to coordinate concurrent work at global and server level (for example, cron operations, backup operations, access to the WAL archive, etc.).

By default (for backward compatibility reasons), barman_lock_directory is set to barman_home.

Important: This change won’t affect users upgrading from a version of Barman older than 1.5.0, unless you have written applications that depend on the names of the lock files. However, this is not a typical case for Barman, and most users do not fall into this category.

Tip: Users are encouraged to use a directory in a volatile partition, such as the one dedicated to run-time variable data (e.g. /var/run/barman).

Customisation of binary paths

As of version 1.6.0, Barman allows users to specify one or more directories where Barman looks for executable files, using the global/server option path_prefix.

If a path_prefix is provided, it must contain a list of one or more directories separated by colon. Barman will search inside these directories first, then in those specified by the PATH environment variable.

By default the path_prefix option is empty.
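
The search order can be illustrated with a few lines of Python based on the standard shutil.which function. This is a sketch of the rule rather than Barman’s implementation; the function name is hypothetical, and a POSIX system (where directories in a search path are separated by a colon) is assumed.

```python
import os
import shutil


def find_executable(command, path_prefix="", environ_path=None):
    """Look for command in path_prefix first, then in the PATH directories."""
    if environ_path is None:
        environ_path = os.environ.get("PATH", os.defpath)
    # On POSIX systems os.pathsep is the colon used by path_prefix.
    search_path = os.pathsep.join(p for p in (path_prefix, environ_path) if p)
    return shutil.which(command, path=search_path)
```

If the same executable exists both in a path_prefix directory and in a PATH directory, the one from path_prefix is returned.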

Integration with standby servers

Barman has been designed for integration with standby servers (with streaming replication or traditional file based log shipping) and high availability tools like repmgr.

From an architectural point of view, PostgreSQL must be configured to archive WAL files directly to the Barman server.

Version 1.6.1 introduces the replication-status command which allows users to get information about any streaming client attached to the managed server, in particular hot standby servers and WAL streamers.

Synchronous WAL streaming

TODO - Explain how to get RPO=0

Troubleshooting

Diagnose a Barman installation

You can gather important information about all the configured servers using:

barman diagnose

The diagnose command also provides other useful information, such as global configuration, SSH version, Python version, rsync version, PostgreSQL clients version, as well as current configuration and status of all servers.

Requesting help

TODO: Mention the mailing list

Submitting a bug

Barman has been extensively tested, and is currently being used in several production environments. However, like any software, Barman is not bug free.

If you discover a bug, please follow this procedure:

  • execute the barman diagnose command;
  • file a bug through the GitHub issue tracker, attaching the output obtained by the diagnostics command above (barman diagnose).

WARNING: Be careful when submitting the output of the diagnose command as it might disclose information that is potentially dangerous from a security point of view.

The Barman project

Support and sponsor opportunities

Barman is free software, written and maintained by 2ndQuadrant. If you require support on using Barman, or if you need new features, please get in touch with 2ndQuadrant. You can sponsor the development of new features of Barman and PostgreSQL which will be made publicly available as open source.

For further information, please visit:

Contributing to Barman

2ndQuadrant has a team of software engineers, architects, database administrators, system administrators, QA engineers, developers and managers that dedicate their time and expertise to improve Barman’s code. We adopt lean and agile methodologies for software development, and we believe in the devops culture that allowed us to implement rigorous testing procedures through cross-functional collaboration. Every Barman commit is the contribution of multiple individuals, at different stages of the production pipeline.

Even though this is our preferred way of developing Barman, we gladly accept patches from external developers, as long as:

  • user documentation (tutorial and man pages) is provided;
  • source code is properly documented and contains relevant comments;
  • code supplied is covered by unit tests;
  • no unrelated feature is compromised or broken;
  • source code is rebased on the current master branch;
  • commits and pull requests are limited to a single feature (multi-feature patches are hard to test and review);
  • changes to the user interface are discussed beforehand with 2ndQuadrant.

We also require that any contributions provide a copyright assignment and a disclaimer of any work-for-hire ownership claims from the employer of the developer.

You can use GitHub’s pull request system for this purpose.

Authors

In alphabetical order:

  • Gabriele Bartolini (project leader)
  • Jonathan Battiato (QA/testing)
  • Stefano Bianucci (developer)
  • Giuseppe Broccolo (QA/testing)
  • Giulio Calacoci (developer)
  • Francesco Canovai (QA/testing)
  • Leonardo Cecchi (developer)
  • Gianni Ciolli (QA/testing)
  • Britt Cole (documentation)
  • Marco Nenciarini (lead developer)
  • Rubens Souza (QA/testing)

Past contributors:

  • Carlo Ascani

License and Contributions

Barman is the exclusive property of 2ndQuadrant Italia and its code is distributed under GNU General Public License 3.

Copyright (C) 2011-2016 2ndQuadrant.it S.r.l.

Barman has been partially funded through 4CaaSt, a research project funded by the European Commission’s Seventh Framework programme.

Contributions to Barman are welcome, and will be listed in the AUTHORS file. 2ndQuadrant Italia requires that any contributions provide a copyright assignment and a disclaimer of any work-for-hire ownership claims from the employer of the developer. This lets us make sure that all of the Barman distribution remains free code. Please contact info@2ndQuadrant.it for a copy of the relevant Copyright Assignment Form.

Feature matrix

Below you will find a matrix of PostgreSQL versions and Barman features for backup and archiving:

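Version | Backup with rsync/SSH | Backup with pg_basebackup | Standard WAL archiving | WAL Streaming | RPO=0
--------+-----------------------+---------------------------+------------------------+---------------+--------
9.6     | Yes                   | Yes                       | Yes                    | Yes           | Yes
9.5     | Yes                   | Yes                       | Yes                    | Yes           | Yes (d)
9.4     | Yes                   | Yes                       | Yes                    | Yes           | Yes (d)
9.3     | Yes                   | Yes (c)                   | Yes                    | Yes (b)       | No
9.2     | Yes                   | Yes (a)(c)                | Yes                    | Yes (a)(b)    | No
9.1     | Yes                   | No                        | Yes                    | No            | No
9.0     | Yes                   | No                        | Yes                    | No            | No
8.4     | Yes                   | No                        | Yes                    | No            | No
8.3     | Yes                   | No                        | Yes                    | No            | No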

Notes:

  a. pg_basebackup and pg_receivexlog 9.2 required
  b. WAL streaming-only not supported (standard archiving required)
  c. Backup of tablespaces not supported
  d. When using pg_receivexlog 9.5, minor version 9.5.5 or higher required [9]

Barman requires that pg_basebackup and pg_receivexlog, of the same version as the PostgreSQL server (or higher), are installed on the same server where Barman resides. The only exception is that PostgreSQL 9.2 users are required to install version 9.2 of pg_basebackup and pg_receivexlog alongside Barman.

TIP: We recommend that the last major, stable version of the PostgreSQL clients (e.g. 9.6) is installed on the Barman server if you plan to use backup and WAL archiving over streaming replication through pg_basebackup and pg_receivexlog, for PostgreSQL 9.3 or higher servers.

TIP: For “RPO=0” architectures, it is recommended to have at least one synchronous standby server.


  1. It is important that you know the difference between logical and physical backup, therefore between pg_dump and a tool like Barman.

  2. Integration with Nagios/Icinga is straightforward thanks to the barman check --nagios command, one of the most important features of Barman and a true lifesaver.

  3. The same requirements for PostgreSQL’s PITR apply for recovery.

  4. Check in the “Feature matrix” which PostgreSQL versions support streaming replication backups with Barman.

  5. Backup of a PostgreSQL server on Windows is possible, but it is still experimental because it is not yet part of our continuous integration system.

  6. This feature is commonly known among the development team members as smelly backup check.

  7. Concurrent backup is a technology that has been available in PostgreSQL since version 9.1, through the streaming replication protocol (using, for example, a tool like pg_basebackup).

  8. In case of concurrent backup, Barman currently has no way to determine that the closing WAL file of a full backup has actually been shipped, unlike the case of an exclusive backup, where it is Postgres itself that makes sure that the WAL file is correctly archived. Be aware that the full backup cannot be considered consistent until that WAL file has been received and archived by Barman. We encourage Barman users to wait at least until that moment before deleting the previous backup.

  9. The commit “Fix pg_receivexlog --synchronous” is required (included in version 9.5.5).