barman.wal_archiver module

class barman.wal_archiver.CloudWalArchiver(backup_manager)

Bases: object

WAL archiver for cloud storage destinations.

This archiver uploads WAL files directly to cloud storage rather than storing them on the local filesystem. It supports parallel prefetching of additional WAL files that are ready for archival, improving throughput when Postgres’s archive_command invokes this archiver.

Unlike other archivers such as FileWalArchiver and StreamingWalArchiver, which archive WALs from the incoming and streaming directories on the Barman host, this class is used to archive WALs living on the Postgres server itself, i.e. WALs in the pg_wal directory.

Variables:

LAST_ARCHIVED_CACHE_FILE – Name of the cache file that stores the last archived WAL name.

LAST_ARCHIVED_CACHE_FILE = 'cloud-wal-last-archived'
__init__(backup_manager)

Initialize the cloud WAL archiver.

Parameters:

backup_manager (barman.backup_manager.BackupManager) – The backup manager of the server in use.

_archive_single_wal(wal_info)

Archive/Upload a single WAL file to cloud storage.

Compresses and uploads the WAL file using the configured storage strategy. Handles duplicate detection: if an identical file already exists in cloud storage, the upload is silently skipped; if a different file exists with the same name, the local file is copied to the errors directory.

Parameters:

wal_info (barman.infofile.WalFileInfo) – Metadata for the WAL file to archive.

_build_wal_info(wal_path)

Create a WalFileInfo object for a given wal_path.

Parameters:

wal_path (str) – Full path to the WAL file.

Returns:

A WalFileInfo instance populated with metadata from the file.

Return type:

barman.infofile.WalFileInfo

_copy_wal_file_to_errors_directory(src, file_name, suffix)

Copy a problematic WAL file to the errors directory.

Unlike Server.move_wal_file_to_errors_directory(), this method copies rather than moves the file, preserving the original. This is necessary when archiving directly from pg_wal, since PostgreSQL owns those files and Barman should not remove them.

Parameters:
  • src (str) – Full path to the source WAL file.

  • file_name (str) – Base name of the WAL file (used for the destination name).

  • suffix (str) – Suffix to append to the destination file name (e.g., "duplicate" or "unknown").

_get_wals_to_prefetch(requested_wal_path, number, xlog_segment_size)

Get the next WAL files in sequence that are ready for prefetched archival.

Computes the names of the next number WAL files after requested_wal_path in the WAL sequence, then checks each for a corresponding .ready marker in pg_wal/archive_status. Discovery stops at the first WAL in the sequence that does not have a .ready file: a gap in readiness means later WALs are unlikely to be ready either, and Postgres will request them soon anyway.

Only actual WAL segment files are returned; .history and .backup files are not prefetched.

Parameters:
  • requested_wal_path (str) – Full path to the WAL file being archived in the main process i.e. the WAL file requested by Postgres’s archive_command.

  • number (int) – Maximum number of WAL files to return for prefetching.

  • xlog_segment_size (int) – The WAL segment size in bytes, used to compute correct segment boundaries when generating the sequence.

Returns:

List of WAL file paths ready for prefetching, in sequence order, up to number entries.

Return type:

list[str]
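
The discovery described above can be sketched as follows. This is a minimal illustration and not Barman's implementation: the helper names next_segment_names and wals_ready_for_prefetch are hypothetical, and the sketch assumes the standard 24-hex-digit segment naming (timeline, log, segment) with segments staying on the same timeline.

```python
import os


def next_segment_names(wal_name, count, xlog_segment_size):
    """Return the `count` segment names following `wal_name` (hypothetical helper)."""
    tli = int(wal_name[0:8], 16)
    log = int(wal_name[8:16], 16)
    seg = int(wal_name[16:24], 16)
    # Number of segments that fit in one 4 GiB "log" id.
    segs_per_log = 0x100000000 // xlog_segment_size
    names = []
    for _ in range(count):
        seg += 1
        if seg >= segs_per_log:
            # Segment counter wraps and the log id advances.
            seg = 0
            log += 1
        names.append("%08X%08X%08X" % (tli, log, seg))
    return names


def wals_ready_for_prefetch(requested_wal_path, number, xlog_segment_size):
    """Return paths of the following WALs that have a .ready marker, in order."""
    wal_dir, wal_name = os.path.split(requested_wal_path)
    status_dir = os.path.join(wal_dir, "archive_status")
    ready = []
    for name in next_segment_names(wal_name, number, xlog_segment_size):
        if not os.path.exists(os.path.join(status_dir, name + ".ready")):
            # Stop at the first gap in readiness.
            break
        ready.append(os.path.join(wal_dir, name))
    return ready
```

Because the loop breaks at the first missing marker, a later segment that happens to be ready is not returned until the ones before it are ready as well.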

_is_already_archived(wal_path)

Check if a WAL file has already been archived based on the last-archived cache.

Compares the WAL name against the cached last-archived WAL. Since WAL file names are lexicographically ordered by LSN, any WAL name <= the cached name has already been archived.

Parameters:

wal_path (str) – Full path to the WAL file.

Returns:

True if the WAL was already archived, False otherwise.

Return type:

bool
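
A minimal sketch of the comparison described above. The function name and the explicit last_archived_name argument are hypothetical (the real method reads the name from the cache file), and treating a missing cache as "not archived" is an assumption.

```python
import os


def is_already_archived(wal_path, last_archived_name):
    """Return True if the WAL name sorts at or before the cached name."""
    if last_archived_name is None:
        # Assumption: with no cache available, nothing is considered archived.
        return False
    return os.path.basename(wal_path) <= last_archived_name
```

Plain string comparison is enough here because WAL names are fixed-width uppercase hexadecimal, so lexicographic order matches numeric order.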

_prefetch_worker(wal_info)

Worker function that archives a WAL file in a subprocess.

Invoked as the target of a WalPrefetchWorker process. Archives the WAL file and exits with code 0 on success or 1 on failure. Exceptions are logged but not propagated, since the parent process checks success via the worker’s exit code.

Parameters:

wal_info (barman.infofile.WalFileInfo) – Metadata for the WAL file to archive.

_read_cloud_wal_last_archived()

Read the name of the last archived WAL file from the cache file.

Returns:

The name of the last archived WAL file, or None if not available.

Return type:

str|None

_update_metadata(wal_info, prefetch_workers)

Update archival metadata on xlogdb and update the last-archived cache.

Appends a line to xlogdb for each successfully archived WAL (in order). The last WAL written to xlogdb is also written to the last-archived cache file.

It stops writing to xlogdb at the first failure found, so as to avoid leaving any holes in xlogdb. E.g. if WAL files A, B, C and D are up for archival, but C fails to be archived, only A and B are written to xlogdb (even though D might have been successfully archived). This keeps the last-archived cache consistent. If we were to also write D to xlogdb, the last-archived cache would be updated with D, and thus the next attempt of C by Postgres would mistakenly consider it as already archived.
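
The A, B, C, D example can be sketched as below. The function wals_to_record and the (name, success) tuples are hypothetical stand-ins for the real worker objects. Only the stop-at-first-failure rule is illustrated.

```python
def wals_to_record(results):
    """Return the WAL names to record, stopping at the first failure.

    `results` is a list of (wal_name, success) pairs in WAL order.
    """
    recorded = []
    for wal_name, success in results:
        if not success:
            # Anything after the first failure is left out, even if it succeeded.
            break
        recorded.append(wal_name)
    return recorded


results = [("A", True), ("B", True), ("C", False), ("D", True)]
recorded = wals_to_record(results)
# The cache would hold the last recorded name, or nothing if none were recorded.
last_archived = recorded[-1] if recorded else None
```

Here D is left out of the recorded list, so the cached name stays at B and a later request for C is not skipped.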

Parameters:
_write_cloud_wal_last_archived(wal_name)

Write the name of the last archived WAL file to the cache file atomically.

Uses a write-to-temporary-file-then-rename pattern to ensure the cache file is never partially written. An incomplete write followed by a crash would leave a corrupt cache file that could cause future WALs to be incorrectly skipped.

Parameters:

wal_name (str) – The name of the last archived WAL file.
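
A minimal sketch of the write-to-temporary-file-then-rename pattern, using only the standard library. The function name is hypothetical, and the fsync before the rename is an assumption about durability rather than something stated above.

```python
import os
import tempfile


def write_last_archived_atomically(cache_path, wal_name):
    """Write `wal_name` to `cache_path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    # The temporary file lives in the same directory, so the rename stays
    # on one filesystem and is atomic on POSIX systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(wal_name)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, cache_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reader opening the cache file sees either the previous content or the new content, never a mix of both.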

archive(wal_path, parallel=0)

Archive a WAL file to cloud storage.

When parallel > 1, the WAL files that immediately follow the requested one in the WAL sequence are checked for readiness (via .ready files in pg_wal/archive_status). Those that are ready are eligible for prefetching. Up to parallel - 1 extra WAL files are prefetched and uploaded concurrently in background worker processes.

The requested WAL is always archived first. Prefetch workers are only started after the primary WAL has been successfully archived. xlogdb is updated for each successfully archived WAL. The last-archived cache is also updated at the end with the last WAL written to xlogdb, so as to enable skipping already-archived files on subsequent invocations.

Parameters:
  • wal_path (str) – Full path to the WAL file to archive, as requested by Postgres’s archive_command.

  • parallel (int) – Total number of WAL files to archive in parallel. 0 or 1 disables prefetching (only the requested WAL is archived). When > 1, up to parallel - 1 extra WALs are prefetched and archived concurrently.

property last_archived_cache_path

Return the full path to the cache file that stores the last archived WAL name.

Returns:

Absolute path to the cache file.

Return type:

str

class barman.wal_archiver.CloudWalStorageStrategy(backup_manager, server)

Bases: WalStorageStrategy

WAL storage strategy for cloud storage.

__init__(backup_manager, server)

Constructor.

Parameters:
  • backup_manager (barman.backup_manager.BackupManager) – The backup manager

  • server (barman.server.Server) – The server

_abc_impl = <_abc._abc_data object>
_build_wal_object_key(wal_name, compression)

Build the full cloud object key for a WAL file.

Parameters:
  • wal_name (str) – the WAL file name (e.g. 000000010000000000000001)

  • compression (str|None) – the compression algorithm name used for the WAL

Returns:

the full cloud object key, including compression extension if applicable

Return type:

str

_check_duplicate(wal_info, object_key)

Check if the cloud object object_key is identical to the source file.

If the WAL was compressed, the decompress parameter of download_file is used to decompress the cloud object before comparison.

Parameters:
  • wal_info (WalFileInfo) – the WAL file info object

  • object_key (str) – the cloud storage object key

Raises:
_get_compression_extension(compression)

Return the file extension for the given compression algorithm.

The compression value must be either None or a key present in compression_registry whose class defines an EXTENSION attribute. Passing an unrecognised algorithm or one without EXTENSION (e.g. pigz, custom) will raise KeyError or AttributeError respectively.

Parameters:

compression (str|None) – the compression algorithm name

Returns:

the file extension (e.g. “.gz”) or “” if no compression

Return type:

str
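
The lookup behaviour described above can be illustrated with a stand-in registry. The classes GzipSketch and PigzSketch and the registry dictionary are hypothetical and only mimic the shape of compression_registry. They are not Barman's compressor classes.

```python
class GzipSketch:
    """Stand-in for a compressor class that defines EXTENSION."""

    EXTENSION = ".gz"


class PigzSketch:
    """Stand-in for a compressor class without an EXTENSION attribute."""


# Hypothetical registry mapping algorithm names to compressor classes.
registry = {"gzip": GzipSketch, "pigz": PigzSketch}


def compression_extension(compression, registry):
    """Return the extension for `compression`, or "" when it is None."""
    if compression is None:
        return ""
    # KeyError for an unknown name, AttributeError if EXTENSION is missing.
    return registry[compression].EXTENSION
```

With this stand-in, None yields an empty string, "gzip" yields ".gz", "pigz" raises AttributeError, and an unknown name raises KeyError.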

delete(wals_to_delete)

Delete WAL files according to the configured destination.

Parameters:

wals_to_delete (dict[str, list[WalFileInfo]]) – A dictionary where key is the WAL directory name and value is a list of wal_info objects representing the WALs to be deleted in that directory.

Returns:

a list of deleted WAL names.

Return type:

list[str]

exists(wal_full_path)
get_full_path(wal_name)

Construct the full cloud object key for a given WAL file name.

Note

This method uses the current compression configuration to determine the file extension. If the compression config has changed since the WAL was originally stored, this method will return a path that does not match the actual cloud object. Callers that need to handle WALs stored under a previous compression config should use bucket listing instead.

Parameters:

wal_name (str) – the WAL file name

Returns:

the full cloud object key

Return type:

str

save(compressor, encryption, wal_info, **kwargs)

Effectively persist a WAL file according to the configured destination.

If a compressor is provided, the WAL file is compressed in-memory before upload and the cloud key includes the compression extension. The wal_info object is updated with the compression type and compressed size so that xlogdb records accurate metadata.

Parameters:
  • compressor – an InternalCompressor instance, or None if no compression is desired

  • encryption (None|Encryption) – the encryptor for the file (if any)

  • wal_info (WalFileInfo) – the WAL file being processed

  • kwargs

    additional parameters for the storage strategy, if any:

    • "skip_delete": if True, the source file will not be deleted after a successful upload.

Raises:

CompressionException – if compressor is not an InternalCompressor instance

Note

Only InternalCompressor subclasses (gzip, bzip2, xz, zstd, lz4, snappy) are supported for cloud WAL storage. pigz and custom are not supported because they rely on external processes and cannot compress in-memory.

Note

Encryption is not yet supported for cloud WAL storage. The encryption parameter is kept for interface compatibility.

class barman.wal_archiver.FileWalArchiver(backup_manager)

Bases: WalArchiver

Manager of file-based WAL archiving operations (aka ‘log shipping’).

__init__(backup_manager)

Base class init method.

Parameters:
  • backup_manager – The backup manager

  • name – The name of this archiver

Returns:

_abc_impl = <_abc._abc_data object>
check(check_strategy)

Perform additional checks for FileWalArchiver - invoked by server.check_postgres

Parameters:

check_strategy (CheckStrategy) – the strategy for the management of the results of the various checks

fetch_remote_status()

Returns the status of the FileWalArchiver.

This method does not raise any exception in case of errors, but sets the missing values to None in the resulting dictionary.

Return type:

dict[str, None|str]

get_next_batch()

Returns the next batch of WAL files that have been archived through PostgreSQL’s ‘archive_command’ (in the ‘incoming’ directory)

Returns:

WalArchiverQueue: list of WAL files

status()

Set additional status info - invoked by Server.status()

class barman.wal_archiver.LocalWalStorageStrategy(backup_manager, server)

Bases: WalStorageStrategy

WAL storage strategy for local filesystem storage.

Note

For now, this class is also responsible for encrypting and compressing files before storing them, but in the future this responsibility should be moved elsewhere, desirably in a shared component, so that strategies’ only responsibility is to actually persist files.

__init__(backup_manager, server)

Constructor.

Parameters:
  • backup_manager (barman.backup_manager.BackupManager) – The backup manager

  • server (barman.server.Server) – The server

_abc_impl = <_abc._abc_data object>
_check_duplicate(src_file, dst_file, wal_info)

Check if the destination WAL file already exists in local storage, and if so, whether it is identical to the source file.

Parameters:
  • src_file (str) – the source WAL file path

  • dst_file (str) – the destination WAL file path

  • wal_info (WalFileInfo) – the WAL file info object

Raises:
_compress_file(compressor, src_file, dst_dir, wal_info)

Compress src_file to a “temp” file inside dst_dir and update wal_info.

Note

The temporary compressed file will have the same name as the source file, with a “.compressed” suffix appended. It’s created on the assumption that the caller will later remove the file or rename/move it to its final destination.

Parameters:
  • compressor – the compressor to use

  • src_file (str) – the file to compress

  • dst_dir (str) – the directory where the compressed file will be created

  • wal_info (WalFileInfo) – the WAL file info object

Returns:

the path to the compressed temporary file

_copy_stats(src_file, dst_file, wal_info)

Copy stats from src_file to dst_file and update wal_info.

This is used to preserve the metadata of the original file after compression or encryption, updating only the size in wal_info accordingly.

Parameters:
  • src_file (str) – the source WAL file path

  • dst_file (str) – the destination WAL file path

  • wal_info (WalFileInfo) – the WAL file info object

_delete_wal_directory(wal_dir, wal_list)
_delete_wal_file(wal_info)

Perform the actual deletion of the WAL file from local storage.

Parameters:

wal_info (WalFileInfo) – the WAL file info object

_encrypt_file(encryption, src_file, dst_dir, wal_info)

Encrypt src_file to a “temp” file inside dst_dir and update wal_info.

Note

The temporary encrypted file will have the same name as the source file, with a “.gpg” suffix appended. It’s created on the assumption that the caller will later remove the file or rename/move it to its final destination.

Parameters:
  • encryption – the encryptor to use

  • src_file (str) – the file to encrypt

  • dst_dir (str) – the directory where the encrypted file will be created

  • wal_info (WalFileInfo) – the WAL file info object

Returns:

the path to the encrypted temporary file

_fsync_contents(src_dir, dst_dir, dst_file)

Fsync the contents of source and destination directories and the destination file.

Parameters:
  • src_dir (str) – the source directory path

  • dst_dir (str) – the destination directory path

  • dst_file (str) – the destination file path
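
A minimal sketch of syncing a file and its surrounding directories on a POSIX system. The helper names and the order of the calls are assumptions. Opening a directory with os.open and calling os.fsync on it is not supported on every platform.

```python
import os


def fsync_path(path):
    """Flush a file or directory to disk via its file descriptor (POSIX)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def fsync_contents_sketch(src_dir, dst_dir, dst_file):
    """Sync the destination file, then both directories (assumed order)."""
    # File data first, then the directory entries that reference it.
    fsync_path(dst_file)
    fsync_path(dst_dir)
    fsync_path(src_dir)
```

Syncing the directories matters because a rename or a newly created file is recorded in the directory itself, not in the file's own data.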

_remove_intermediary_files()

Remove any intermediary files created during the archival process

_rename_or_copy_file(src_file, dst_file)

Rename or copy src_file to dst_file.

Rename is attempted first, and if it fails (because the source and destination are on different filesystems), a copy is performed.

Parameters:
  • src_file (str) – the source file path

  • dst_file (str) – the destination file path
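
The rename-then-copy fallback can be sketched with the standard library as below. The function name is hypothetical. Restricting the fallback to the cross-device error (errno.EXDEV) is an assumption made for this sketch and may be narrower than what the real method catches.

```python
import errno
import os
import shutil


def rename_or_copy(src_file, dst_file):
    """Rename src_file to dst_file, copying instead across filesystems."""
    try:
        os.rename(src_file, dst_file)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            # Not a cross-filesystem problem: let the caller see the error.
            raise
        # Rename cannot cross filesystems, so copy data and metadata instead.
        shutil.copy2(src_file, dst_file)
```

Note that after a successful rename the source no longer exists, whereas after the copy fallback it is still in place and has to be handled by the caller.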

delete(wals_to_delete)

Delete WAL files according to the configured destination.

Parameters:

wals_to_delete (dict[str, list[WalFileInfo]]) – A dictionary where key is the WAL directory name and value is a list of wal_info objects representing the WALs to be deleted in that directory.

Returns:

a list of deleted WAL names.

Return type:

list[str]

exists(wal_full_path)
get_full_path(wal_name)
save(compressor, encryption, wal_info, **kwargs)

Effectively persist a WAL file according to the configured destination.

Compression and encryption are applied, if requested.

Parameters:
  • compressor – the compressor for the file (if any)

  • encryption (None|Encryption) – the encryptor for the file (if any)

  • wal_info (WalFileInfo) – the WAL file being processed

  • kwargs – additional parameters for the storage strategy, if any

Raises:
class barman.wal_archiver.StreamingWalArchiver(backup_manager)

Bases: WalArchiver

Object used for the management of streaming WAL archive operation.

__init__(backup_manager)

Base class init method.

Parameters:
  • backup_manager – The backup manager

  • name – The name of this archiver

Returns:

_abc_impl = <_abc._abc_data object>
_is_synchronous()

Check if receive-wal process is eligible for synchronous replication

The receive-wal process is eligible for synchronous replication if synchronous_standby_names is configured and contains the value of streaming_archiver_name

Return type:

bool

_reset_streaming_status(postgres_status, streaming_status)

Reset the status of receive-wal by removing the .partial file that is marking the current position and creating one that is current with the PostgreSQL insert location

_truncate_partial_file_if_needed(xlog_segment_size)

Truncate the .partial WAL file if its size is neither 0 nor xlog_segment_size

Parameters:

xlog_segment_size (int)

check(check_strategy)

Perform additional checks for StreamingWalArchiver - invoked by server.check_postgres

Parameters:

check_strategy (CheckStrategy) – the strategy for the management of the results of the various checks

fetch_remote_status()

Execute checks for replication-based wal archiving

This method does not raise any exception in case of errors, but sets the missing values to None in the resulting dictionary.

Return type:

dict[str, None|str]

get_next_batch()

Returns the next batch of WAL files that have been archived via streaming replication (in the ‘streaming’ directory)

This method always leaves one file in the “streaming” directory, because the ‘pg_receivexlog’ process needs at least one file to detect the current streaming position after a restart.

Returns:

WalArchiverQueue: list of WAL files

receive_wal(reset=False)

Creates a PgReceiveXlog object and issues the pg_receivexlog command for a specific server

Parameters:

reset (bool) – When set, resets the status of receive-wal

Raises:

ArchiverFailure – when something goes wrong

status()

Set additional status info - invoked by Server.status()

class barman.wal_archiver.WalArchiver(backup_manager, name)

Bases: RemoteStatusMixin

Base class for WAL archiver objects

__init__(backup_manager, name)

Base class init method.

Parameters:
  • backup_manager – The backup manager

  • name – The name of this archiver

Returns:

_abc_impl = <_abc._abc_data object>
archive(verbose=True)

Archive WAL files, discarding duplicates or those that are not valid.

Parameters:

verbose (boolean) – Flag for verbose output

abstractmethod check(check_strategy)

Perform specific checks for the archiver - invoked by server.check_postgres

Parameters:

check_strategy (CheckStrategy) – the strategy for the management of the results of the various checks

abstractmethod get_next_batch()

Return a WalArchiverQueue containing the WAL files to be archived.

Return type:

WalArchiverQueue

receive_wal(reset=False)

Manage reception of WAL files. Does nothing by default. Some archiver classes, like the StreamingWalArchiver, have a full implementation.

Parameters:

reset (bool) – When set, resets the status of receive-wal

Raises:

ArchiverFailure – when something goes wrong

abstractmethod status()

Set additional status info - invoked by Server.status()

static summarise_error_files(error_files)

Summarise a list of error files

Parameters:

error_files (list[str]) – Error files list to summarise

Returns:

A summary, or None if there are no error files

Return type:

str|None

class barman.wal_archiver.WalArchiverQueue(items, errors=None, skip=None, batch_size=0, total_size=0)

Bases: list

__init__(items, errors=None, skip=None, batch_size=0, total_size=0)

A WalArchiverQueue is a list of WalFileInfo objects with two extra list attributes:

  • errors: containing a list of unrecognized files

  • skip: containing a list of skipped files.

It also stores batch run size information in case it is requested by configuration, in order to limit the number of WAL files that are processed in a single run of the archive-wal command.

Note

This class was originally designed to hold all the WAL files pending to be archived, and to process up to a certain number of files (the batch size). However, it is now used to hold only the files that are being processed in a single run of the archive-wal command. That change was made to avoid the overhead of having to create lots of WalFileInfo objects, even if only a subset of them would be used by the archiver in a given batch. We might want to rework or remove this class in the future, as it doesn’t seem to add much value to the archiving process anymore.

Parameters:
  • items – iterable from which to initialize the list

  • errors – an optional list of unrecognized files

  • skip – an optional list of skipped files

  • batch_size – size of the current batch run (0=unlimited)

  • total_size – the total number of WAL files available for archiving.
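
The shape of such a queue can be sketched as a list subclass carrying the extra attributes. QueueSketch is a hypothetical stand-in. Defaulting errors and skip to empty lists is an assumption, and the real class may expose more than this.

```python
class QueueSketch(list):
    """A list of items plus bookkeeping about errors, skips and batch limits."""

    def __init__(self, items, errors=None, skip=None, batch_size=0, total_size=0):
        super().__init__(items)
        # Assumption: missing lists default to empty ones.
        self.errors = [] if errors is None else errors
        self.skip = [] if skip is None else skip
        # 0 means the batch is unlimited, as in the parameter description.
        self.batch_size = batch_size
        self.total_size = total_size


queue = QueueSketch(["000000010000000000000001"], skip=["junk.tmp"], total_size=5)
```

Since it subclasses list, the object can be iterated and indexed like any other list while still carrying the errors and skip lists alongside.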

class barman.wal_archiver.WalPrefetchWorker(wal_info, *args, **kwargs)

Bases: Process

Custom worker class used to prefetch WAL files in parallel.

It is a simple wrapper around multiprocessing.Process that holds the WAL info to archive and provides a success property to check whether the process was successful after it has finished.
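
A wrapper of this kind can be sketched as follows. PrefetchWorkerSketch is a hypothetical name, and deriving success from an exit code of 0 is an assumption based on the exit-code convention described for _prefetch_worker.

```python
import multiprocessing


class PrefetchWorkerSketch(multiprocessing.Process):
    """Process subclass that remembers which WAL it was asked to archive."""

    def __init__(self, wal_info, *args, **kwargs):
        # Remaining arguments (target, args, ...) go to Process unchanged.
        super().__init__(*args, **kwargs)
        self.wal_info = wal_info

    @property
    def success(self):
        """True only once the process has finished with exit code 0."""
        # exitcode is None until the process has been started and has ended.
        return self.exitcode == 0
```

A parent process would typically create one worker per prefetched WAL, call start() on each, join() them, and then inspect success in WAL order.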

__init__(wal_info, *args, **kwargs)
property success
class barman.wal_archiver.WalStorageStrategy(backup_manager, server)

Bases: object

Abstract base class for WAL storage strategies.

WAL storage strategies are used to effectively store WAL files in the configured destination, be it local filesystem or cloud storage.

__init__(backup_manager, server)

Constructor.

Parameters:
  • backup_manager (barman.backup_manager.BackupManager) – The backup manager

  • server (barman.server.Server) – The server

_abc_impl = <_abc._abc_data object>
_run_post_archive_scripts(wal_info, dest_file, error)

Run the post-archive scripts.

Parameters:
  • wal_info (WalFileInfo) – the WAL file info object

  • dest_file (str) – the destination WAL file path

  • error (Exception|None) – the exception raised during the archival process, if any

_run_post_delete_wal_scripts(wal_info, error=None)

Run the post-delete hook-scripts, if any, on the given WAL.

Parameters:
_run_pre_archive_scripts(wal_info, src_file)

Run the pre-archive scripts.

Parameters:
  • wal_info (WalFileInfo) – the WAL file info object

  • src_file (str) – the source WAL file path

_run_pre_delete_wal_scripts(wal_info)

Run the pre-delete hook-scripts, if any, on the given WAL.

Parameters:

wal_info (barman.infofile.WalFileInfo) – WAL to run the script on.

abstractmethod delete(wals_to_delete)

Delete WAL files according to the configured destination.

Parameters:

wals_to_delete (dict[str, list[WalFileInfo]]) – A dictionary where key is the WAL directory name and value is a list of wal_info objects representing the WALs to be deleted in that directory.

Returns:

a list of deleted WAL names.

Return type:

list[str]

abstractmethod save(compressor, encryption, wal_info, **kwargs)

Effectively persist a WAL file according to the configured destination.

Parameters:
  • compressor – the compressor for the file (if any)

  • encryption (None|Encryption) – the encryptor for the file (if any)

  • wal_info (WalFileInfo) – the WAL file being processed

  • kwargs – additional parameters for the storage strategy, if any