Terminology

Adopting

The task of taking an existing backup Source and using it from a new installation of the client. Involves doing a Rebuild operation from the remote Manifest.

Block

A single piece of data stored in a destination. A block is usually a few MB in size and can contain anything from a small part of a very large file to many small files.

When working with a backup, the system will always upload and download entire blocks from a backup destination.
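The relationship between files and blocks can be illustrated with a minimal sketch. The function name `pack_into_blocks`, the `Piece` tuple, and the tiny block size are hypothetical and chosen only for illustration; this is not how the client actually lays out blocks.

```python
from typing import Dict, List, Tuple

# Hypothetical piece layout: (file name, offset within the file, data).
Piece = Tuple[str, int, bytes]


def pack_into_blocks(files: Dict[str, bytes], block_size: int) -> List[List[Piece]]:
    """Group file data into blocks of at most block_size bytes (illustrative only)."""
    blocks: List[List[Piece]] = []
    current: List[Piece] = []  # block currently being filled with small files
    used = 0                   # bytes already placed in the current block

    for name, data in files.items():
        if len(data) > block_size:
            # A large file is split so that each block holds one part of it.
            for offset in range(0, len(data), block_size):
                blocks.append([(name, offset, data[offset:offset + block_size])])
            continue
        if used + len(data) > block_size:
            # The small file does not fit, so close the current block.
            blocks.append(current)
            current, used = [], 0
        current.append((name, 0, data))
        used += len(data)

    if current:
        blocks.append(current)
    return blocks


# Tiny block size for demonstration; real blocks are usually a few MB.
example_files = {"a.txt": b"aa", "b.txt": b"bbb", "big.bin": b"x" * 20, "c.txt": b"cc"}
example_blocks = pack_into_blocks(example_files, block_size=8)
```

In this sketch the large file ends up spread over several blocks while the three small files share a single block, which is why restoring one small file still means downloading the whole block it lives in.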

Destination

A location where backup data is stored. A destination is usually defined by a type, a location, and a set of credentials.

Source

A source refers to a backup of a single system. Think of it as a specific installation of the Underscore Backup client running on a specific system. You usually start a restore by selecting which source you want to restore from.

Share

A share is a subset of a source that is shared with another Underscore Backup user. Shares use a different Private Key for encryption, which means that the recipient of a share can only read the files that have been explicitly shared and not the entire source.

File

Refers to the collection of all the versions of a file from a specific location on your system that have been stored so far in your backup.

File version

A specific version of a file representing the contents of the file at a given time.

Manifest

Refers to both the local metadata stored for a backup where the client is running and its representation in a Destination, which is used to recreate the local metadata from a backup. The local representation contains your configuration, your public encryption key, and a database of your entire Source contents. The equivalent representation in the Destination contains the configuration and key, but instead of a database it contains a log of all the changes that were made to the local manifest database to get it to its current state.

Operations

When long-running tasks are being performed, progress is sometimes reported as Operations. Each operation represents a single atomic step that must be performed before the entire task is complete. What an operation means varies from task to task, so the count should only be used to gauge when the overall task will be done.

Optimize log

The remote representation of a backup is a continuous log of all the changes made in a Source. If a backup has been running for a long time, this log can become inefficient because it contains information about everything that has ever existed, including things that have since been deleted or changed. To solve this problem, the system by default runs a scheduled maintenance task once a month that optimizes the log: it writes a new log describing exactly what the backup contains right now, which replaces the entire log of the backup up to that moment.

Private key

The key that is required to restore data from your backup. It is derived from your backup password. The public key is derived mathematically from this key.

Public key

The key required to write backup data. It is mathematically derived from the private key in such a way that it is easy to derive the public key from the private key, but practically impossible to infer the private key from the public key.
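The one-way relationship between password, private key, and public key can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the salt, the iteration count, and the use of modular exponentiation are not the scheme Underscore Backup uses, and the code is not suitable for protecting real data.

```python
import hashlib

# Toy parameters (assumptions for illustration only, not secure choices).
P = 2**255 - 19  # a well-known prime
G = 2            # base used for the exponentiation


def derive_private_key(password: str, salt: bytes) -> int:
    """Stretch the password into a large secret number (hypothetical helper)."""
    raw = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return int.from_bytes(raw, "big") % (P - 1)


def derive_public_key(private_key: int) -> int:
    """Easy direction: one modular exponentiation.

    Going the other way means recovering the exponent from the result,
    which is the hard direction that this kind of construction relies on.
    """
    return pow(G, private_key, P)


private_key = derive_private_key("correct horse battery staple", b"example-salt")
public_key = derive_public_key(private_key)
```

The same password and salt always produce the same private key, which is why the backup password alone is enough to regain the ability to restore.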

Rebuild

The task of rebuilding the local manifest database from the log stored in the backup Destination for the manifest.
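The Rebuild and Optimize log entries above both revolve around replaying a change log. A minimal sketch of the idea follows, assuming a hypothetical log format of `(action, path, version)` tuples; the real manifest log format is not described here and will differ.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log entry: (action, path, version id). Not the real format.
LogEntry = Tuple[str, str, Optional[str]]


def rebuild(log: List[LogEntry]) -> Dict[str, List[str]]:
    """Replay every change in order to recreate the local database."""
    database: Dict[str, List[str]] = {}
    for action, path, version in log:
        if action == "add" and version is not None:
            database.setdefault(path, []).append(version)
        elif action == "delete":
            database.pop(path, None)
    return database


def optimize(log: List[LogEntry]) -> List[LogEntry]:
    """Write a new log that describes only what the backup contains right now."""
    database = rebuild(log)
    return [("add", path, version)
            for path, versions in database.items()
            for version in versions]


history: List[LogEntry] = [
    ("add", "docs/a.txt", "v1"),
    ("add", "docs/a.txt", "v2"),
    ("add", "tmp/b.txt", "v1"),
    ("delete", "tmp/b.txt", None),
    ("add", "docs/c.txt", "v1"),
]
compact = optimize(history)
```

The optimized log is shorter than the original but rebuilds to exactly the same database, which is the property the maintenance task relies on.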

Retention

The settings for a backup set that define how many file versions should be retained and when old versions or files should be deleted from the backup.

Schedule

How often a certain operation should happen. Usually this refers to how often a backup set should be scanned for updated files.

Set

A collection of settings that define a group of files, with a schedule, retention, and destinations.

Trimming

Trimming refers to applying retention settings to your repository. It usually runs after each backup set completes.

Retention explained

Retention in Underscore Backup is very flexible but can be somewhat tricky to understand. For each backup set, you can specify different retention. There is also a global retention setting that is used for all files that are not contained in any current set. A file can end up in your backup without belonging to any set if you change the definition of a set after the file has been backed up.

Retention is defined in four parts. The first is the default initial retention, which limits how often a new version of a file is kept. For instance, if your default retention is set to 15 minutes, then even if a file changes every 5 minutes the backup will only contain one version per 15 minutes. Second, you can progressively move to less and less frequent versions as the versions grow older. Third, there is a setting for how long files should be kept after they have been deleted. Finally, there is a catch-all setting for the maximum number of versions to keep of a file.

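To explain, consider an example with the following settings: an initial retention of 15 minutes, one version per day after 1 month, one version per month after 6 months, deleted files retained for 1 month, and no maximum number of versions.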

In this example, the application would keep at most one version of the file every 15 minutes. Once a version is older than 1 month, only a single version per day is kept and all the other versions are discarded. After a version becomes 6 months old, another culling takes place and only a single version per month is retained. Finally, the file will be kept for up to 1 month after it has been deleted from your system.

In this example there is no maximum number of versions. If a maximum of 10 were enabled, it would be applied on top of the previous options, so there would never be more than 10 versions of any single file.
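The example above can be sketched in code to make the rules concrete. This is only an approximation under stated assumptions: the `Retention` class and its field names are hypothetical, a month is treated as 30 days, and versions are thinned by requiring a minimum gap between kept versions, which may differ from how the client actually selects versions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Retention:
    """Hypothetical model of the four retention parts."""
    default_frequency: timedelta
    # (minimum age, frequency) pairs for progressively older versions.
    older: List[Tuple[timedelta, timedelta]] = field(default_factory=list)
    retain_deleted: timedelta = timedelta(days=30)
    max_versions: Optional[int] = None


def frequency_for_age(age: timedelta, retention: Retention) -> timedelta:
    """Pick the frequency that applies to a version of the given age."""
    frequency = retention.default_frequency
    for minimum_age, older_frequency in sorted(retention.older):
        if age >= minimum_age:
            frequency = older_frequency
    return frequency


def versions_to_keep(versions: List[datetime], now: datetime,
                     retention: Retention) -> List[datetime]:
    """Return the versions that survive trimming, newest first."""
    kept: List[datetime] = []
    for version in sorted(versions, reverse=True):
        frequency = frequency_for_age(now - version, retention)
        # Keep a version only if it is far enough from the last kept one.
        if not kept or kept[-1] - version >= frequency:
            kept.append(version)
    if retention.max_versions is not None:
        kept = kept[:retention.max_versions]  # catch-all limit, newest win
    return kept


def deleted_file_expired(deleted_at: datetime, now: datetime,
                         retention: Retention) -> bool:
    """True once a deleted file has outlived its grace period."""
    return now - deleted_at > retention.retain_deleted


# The settings from the example, with a month approximated as 30 days.
EXAMPLE = Retention(
    default_frequency=timedelta(minutes=15),
    older=[(timedelta(days=30), timedelta(days=1)),
           (timedelta(days=180), timedelta(days=30))],
    retain_deleted=timedelta(days=30),
    max_versions=None,
)
```

Feeding this sketch a file that changed every 5 minutes during the last hour leaves one version per 15 minutes, while hourly versions that are two months old are thinned to roughly one per day.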

Retention in combination with continuous backups

Continuous backups complicate retention settings, because if a file changes very often it would be very inefficient to replace the most recent version of the file every time it changes. To solve this, when continuous backup is enabled the system will only save a new version of a file once per retention period.

As an example, let's say that a file updates once every minute for 20 minutes while the retention for the file is to keep a version every 15 minutes. The first version is saved immediately after the first change. The next version is not stored until 15 minutes later. The system then keeps track of the fact that the file continues to change for another 5 minutes until the changes stop, but the latest version of the file is not saved until 30 minutes after the beginning of this cycle (15 minutes after the previous saved version).
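The timeline in this example can be reproduced with a small simulation. The function `simulate_saves` is a hypothetical sketch of the behavior described, with times expressed as whole minutes; it is not the client's actual scheduling code.

```python
from typing import List, Optional


def simulate_saves(change_times: List[int], period: int) -> List[int]:
    """Return the minutes at which a version would be saved.

    The first change is saved at once. Later changes inside the same
    retention period only mark the file as pending, and the pending
    version is saved when the period since the last save has elapsed.
    """
    saves: List[int] = []
    pending_due: Optional[int] = None  # minute at which a pending save fires

    for minute in sorted(change_times):
        # A pending save that has become due fires before this change is handled.
        if pending_due is not None and pending_due <= minute:
            saves.append(pending_due)
            pending_due = None
        if not saves or minute - saves[-1] >= period:
            saves.append(minute)  # nothing saved recently, save right away
        elif pending_due is None:
            pending_due = saves[-1] + period  # remember that the file changed

    if pending_due is not None:
        saves.append(pending_due)  # the last pending change is saved eventually
    return saves


# A file that changes every minute from minute 0 through minute 20,
# with a retention period of 15 minutes.
save_minutes = simulate_saves(list(range(0, 21)), period=15)
```

With these inputs the sketch saves versions at minutes 0, 15, and 30, matching the three saves described in the example.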

How retention is applied

Retention is applied when each set completes. If a backup is interrupted while still in progress, for instance to do a restore, the restore could see slightly more versions than retention specifies, since new versions have been stored but the culling they would trigger has not yet happened. If you are looking at the UI while a backup runs, you will see a status of Trimming while retention is being applied.