Backup

Table of Contents

Introduction
General Considerations
Pure POSIX Setup
Distributed Setup

Introduction

Making a backup of the Infinite Scale instance is an important task that should be done on a regular basis. This ensures that in case of issues, data can be restored from a backup.

Though no backup strategies are covered here, reading this document can add valuable information when creating a backup and restore strategy.

General Considerations

Infinite Scale can run in two different setups when it comes to storing data:

All data, meaning configuration, blobs and metadata, is stored on POSIX-compliant filesystems.
Blobs are stored on an S3-compliant storage while configuration and metadata are stored on POSIX.

For details on supported filesystems, see: Filesystems and Shared Storage.

Depending on how this data is stored, backup strategies and tasks will differ, see the sections below.

You have to fully stop the Infinite Scale instance to create a consistent state necessary for performing the backup. In the following scenarios, this state of consistency is a prerequisite for all upcoming tasks. To avoid redundancy, this step will not be repeatedly written down every time. When backup has finished, Infinite Scale can be started again.

There are two general approaches to achieve a backup:

When the underlying filesystems of the storages used provide snapshot capabilities, creating the base for the backup will only take a few seconds. This is the preferred method as the Infinite Scale instance can be started again directly after creating the necessary snapshots. To prevent data loss in case of hardware failures, it is recommend to copy the snapshots to a secondary storage or use backup software of your choice that accesses the snapshots for further processing.
If snapshot capabilities are not available or can’t be used, backup software of your choice can be used to backup Infinite Scale.

See the Default Paths documentation for more information on where data is saved and how different locations can be defined.

Keep in mind that you always have to back up all data sets consistently:

config data
system data ¹
metadata ²
blobs ²

(1) … System data has, if not otherwise defined, a common root path which is the same root path as for metadata. Individual paths can be defined for particular services though. See the Base Data Directory for more details. Also see the search configuration example. If the search index does not get backed up, it needs to be recreated.

(2) … For POSIX-only environments, the location for blobs and metadata is the same. When using S3, the location differs. The backup process must include the metadata, for details see Filesystems and Shared Storage.

As a rule of thumb, consider also to back up the Infinite Scale binary or the container used. This avoids difficulties when restoring the instance and possibly using an updated Infinite Scale version compared to the version used when backing up. Restoring and updating should always be different tasks.

Pure POSIX Setup

If all data - configuration, blobs and metadata - are stored on POSIX-compliant filesystems, create a consistent state and back up the data sets. This is easiest if you have only one filesystem for all of them. If you have different filesystems for the config and blobs/metadata, you need to do this task for each of them.

Distributed Setup

If data is distributed with configuration and metadata stored on POSIX-compliant filesystems and blobs are stored on S3, create a consistent state and back up:

the configuration and metadata,
the S3 bucket according to the rules and possibilities of the S3 provider.