Service and client encryption details

This section is fairly technical. It walks through how encryption is used in the product and is intended for readers who are well versed in basic encryption primitives. If that is not you, it is enough to know that all data is encrypted on the client using a combination of state-of-the-art post-quantum cryptography and elliptic-curve cryptography. At no point are your data or your unencrypted private keys uploaded to either the service or your backup destinations.

What cryptographic primitives are used by the service?

The application uses several different cryptographic primitives. Master passwords are hashed using the Argon2 algorithm. This is true for both service passwords and backup passwords. In addition, the service makes extensive use of both SHA-256 and SHA3 hashes during the backup.

The service uses two separate asymmetric encryption algorithms for encrypting customer data. Versions of the service before version 3 only used an X25519 key exchange (X25519). Version 3 and later add support for the Kyber key encapsulation mechanism (Kyber), which is a quantum-safe algorithm. Kyber is always used in the Kyber-1024 variant.

For symmetric encryption, AES-256 is used universally in GCM mode, which also authenticates the encrypted data.

How is the master password used?

The backup master password is hashed using Argon2, and the resulting 256-bit hash is then combined with a 256-bit random nonce using XOR. The resulting 256-bit value is used as the key for encrypting any private key material in the backup key. Each encrypted value is stored prepended with a unique 96-bit initialization vector (IV).
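The XOR step above can be sketched as follows. This is a minimal illustration, not the product's code: the Python standard library has no Argon2, so PBKDF2 is swapped in purely to obtain a 256-bit password hash, and the function names, salt size, and example password are all hypothetical.

```python
import hashlib
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def stand_in_password_hash(password: str, salt: bytes) -> bytes:
    # Stand-in only: the product uses Argon2. PBKDF2 is used here just to
    # get a 256-bit value without third-party dependencies.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000, dklen=32
    )


def derive_wrapping_key(password_hash: bytes, nonce: bytes) -> bytes:
    # 256-bit password hash XOR 256-bit random nonce -> 256-bit key used
    # to encrypt the private key material.
    return xor_bytes(password_hash, nonce)


salt = secrets.token_bytes(16)   # hypothetical salt size
nonce = secrets.token_bytes(32)  # 256-bit random nonce
iv = secrets.token_bytes(12)     # unique 96-bit IV, stored before each ciphertext
password_hash = stand_in_password_hash("example master password", salt)
wrapping_key = derive_wrapping_key(password_hash, nonce)
```

Because XOR is its own inverse, anyone who knows the password and the stored nonce can recompute the same key, while the nonce alone reveals nothing about the password hash.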

How are backup blocks encrypted?

You can choose how you wish to encrypt your data. The first option is to not encrypt the data at all. The second is to encrypt it using the X25519 key exchange only. The third, which is the default, is to encrypt it using both X25519 and Kyber.

For each block, we generate a new key pair for the respective encryption method. This key pair is then combined with the stored public key for the backup to create a unique 256-bit key for the block. The private key of the newly created key pair is then discarded, but its public key is stored with the metadata of the block. This means that the key can now only be restored by combining the stored public key with the private key of the backup, rather than with the backup public key that was used to store the data. When using multiple encryption methods, each method generates a 256-bit key, and these keys are XORed together to produce a combined key.
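The final combination step can be sketched as below. The per-method keys are random stand-ins here, since performing a real X25519 exchange or Kyber encapsulation would need a third-party library; the function and variable names are illustrative assumptions.

```python
import secrets
from functools import reduce
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def combine_method_keys(method_keys: Sequence[bytes]) -> bytes:
    # XOR every per-method 256-bit key together into one combined key.
    if not method_keys:
        raise ValueError("at least one method key is required")
    return reduce(xor_bytes, method_keys)


# Stand-ins for the 256-bit keys each method would produce for one block.
x25519_key = secrets.token_bytes(32)
kyber_key = secrets.token_bytes(32)
combined_key = combine_method_keys([x25519_key, kyber_key])
```

With a single method the combined key is simply that method's key. With two methods, an attacker has to recover both per-method keys to reconstruct the combined key.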

Once the encryption key is created, the behavior depends on whether you have set the crossSourceDedupe property to false. If you have not, a SHA3 hash of the plaintext block is created, and an XOR key mask is stored with the block metadata that lets you get to the SHA3 value of the block from the encryption key of the previous step. The block itself is then encrypted using AES-256 with an IV of 0. This is safe because every single block is encrypted using a different encryption key (otherwise, AES-256 in GCM mode has issues when encrypting different data using the same key and IV).
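A minimal sketch of the key mask follows. It assumes the 256-bit SHA3 variant (SHA3-256), which the text does not state explicitly but which matches the 256-bit key size; the function names are hypothetical.

```python
import hashlib
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_key_mask(combined_key: bytes, plaintext: bytes) -> bytes:
    # The mask is stored with the block metadata. On its own it reveals
    # neither the combined key nor the plaintext hash.
    content_hash = hashlib.sha3_256(plaintext).digest()  # assumed variant
    return xor_bytes(combined_key, content_hash)


def recover_content_hash(combined_key: bytes, key_mask: bytes) -> bytes:
    # Anyone who can rebuild the combined key can undo the mask.
    return xor_bytes(combined_key, key_mask)


plaintext = b"example block contents"
# Two hypothetical combined keys, as produced for the same data on two
# separate occasions.
key_a = secrets.token_bytes(32)
key_b = secrets.token_bytes(32)
mask_a = make_key_mask(key_a, plaintext)
mask_b = make_key_mask(key_b, plaintext)
```

Both masks lead back to the same SHA3 value even though the combined keys differ. A value that depends only on the block contents is what allows identical blocks to be recognized across sources.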

If the crossSourceDedupe property is set to false, a random 256-bit set of key data is generated, which is then combined with the combined key. The block is then encrypted using this combined key and a random IV for each block. The IV is then stored with the encrypted blob.

Deduplication is done at the block level for large files. Blocks are labeled by their SHA-256 hash. However, to ensure that the same data does not have the same block label on different backups, each backup source generates a 256-bit block salt, which is just a fixed nonce that is appended to any block's contents when calculating the block hash. This salt is also stored with the encryption key.
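The salted block label can be sketched as below. The hex encoding of the label and the function name are assumptions made for illustration; the text only specifies that the salt is appended to the contents before hashing with SHA-256.

```python
import hashlib
import secrets


def block_label(contents: bytes, block_salt: bytes) -> str:
    # The salt is appended to the block contents before hashing.
    return hashlib.sha256(contents + block_salt).hexdigest()


data = b"example block contents"
salt_source_a = secrets.token_bytes(32)  # 256-bit salt for one backup source
salt_source_b = secrets.token_bytes(32)  # 256-bit salt for another source

label_a = block_label(data, salt_source_a)
label_b = block_label(data, salt_source_b)
```

Within one source the same data always maps to the same label, so duplicates are found. Across sources the labels differ, so an observer of the storage cannot tell that two backups contain the same block.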

How are backup logs and configuration encrypted?

Backup logs and configuration data are encrypted in the same way as backup blocks, with the exception that no key data is used to modify the combined key. For these files, the encryption key is always the same as the combined key. The advantage of this approach is that you do not need any additional metadata apart from your private keys to decrypt this kind of file. However, you also need to rewrite all of this data when you generate new private keys, whereas your backup blocks can be re-keyed by only rewriting the metadata and updating the key mask for each block. This is why the logs have to be rewritten when you regenerate your private keys, but your backed-up data does not need to be updated.

How are small files encrypted?

Large files are generally stored in one or more blocks by themselves. However, for small files (less than 8 MB in size by default) there is a more efficient way of storing the data, where many files can be stored in a single block. Because you might want to share some of the small files in a block but not others, small files are additionally encrypted: the contents of each file inside the block are encrypted using AES-256 in CBC mode, with the SHA-256 hash of the file's contents as the key and an empty IV (again safe, since each key is a unique SHA-256 hash). This way the block can be shared along with the individual SHA-256 hashes of the contained files you wish to share, and any file in the block you do not wish to share can be rendered unreadable by simply not sharing the needed decryption key.
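The selective sharing described above comes down to which per-file keys are handed out. The sketch below shows only the key selection, not the AES-256 CBC encryption itself, and the file names and function names are hypothetical.

```python
import hashlib
from typing import Dict, Set


def file_key(contents: bytes) -> bytes:
    # Each small file is keyed by the SHA-256 hash of its own contents.
    return hashlib.sha256(contents).digest()


def keys_to_share(files: Dict[str, bytes], shared_names: Set[str]) -> Dict[str, bytes]:
    # Only the keys of the files being shared are handed out. Every other
    # file in the same block stays unreadable to the recipient.
    return {
        name: file_key(contents)
        for name, contents in files.items()
        if name in shared_names
    }


# Hypothetical small files packed into one block.
files = {
    "notes.txt": b"meeting notes",
    "budget.csv": b"item,cost\nlaptop,1200\n",
}
shared = keys_to_share(files, {"notes.txt"})
```

Here the recipient receives the whole block but only the key for notes.txt, so budget.csv cannot be decrypted.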

How are encryption keys stored?

Depending on where the key is stored, different parts of it are encrypted, stored in plain text, or not included at all, as shown in the table below.

Key content     | On host   | Source definition | Source storage | Share key
Argon2 hash     | Included  | Included          | Included       | Not included
Public keys     | Plaintext | Encrypted         | Encrypted      | Encrypted
Private keys    | Encrypted | Encrypted         | Encrypted      | Encrypted
Sharing key     | Included  | Not included      | Included       | Not applicable
Additional keys | Included  | Not included      | Included       | Not applicable
Block salt      | Plaintext | Encrypted         | Encrypted      | Not applicable

The additional keys and sharing keys are additional sets of encryption keys that are used when sharing and will be described in detail in a separate section below.

As we always need to specify the backup master password to restore data, everything can be encrypted whenever the key metadata is stored remotely. However, we do need access to the public keys as well as the block salt when we are running a backup without having access to the private keys.

How does sharing work?

In general, sharing works by having the sharing client hold the public keys of a key pair while the client receiving the share holds the private key. When doing this without the Underscore Backup service, the public key needs to be shared out of band. When a share is activated, every existing block is processed and its encryption key is determined (using the backup master password to decrypt the main private keys). We then do exactly the same thing as when encrypting a new block, storing the public keys of the newly generated and discarded key pairs in the new block. Critically, though, the key mask is created as the XOR of the original encryption key and the newly generated key. This new block is then stored in the log needed to access the share. All the logs needed to create the share are generally encrypted as if this were a normal backup using the share public key.
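The share key mask can be sketched as follows. Both keys are random stand-ins here; in the product the share key would come from a key exchange against the share public key, and the function names are hypothetical.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_share_mask(original_block_key: bytes, share_key: bytes) -> bytes:
    # Stored in the share's copy of the block metadata. The block data
    # itself is not re-encrypted.
    return xor_bytes(original_block_key, share_key)


def recover_block_key(share_key: bytes, share_mask: bytes) -> bytes:
    # The recipient rebuilds share_key from its private key and the stored
    # ephemeral public key, then undoes the mask.
    return xor_bytes(share_key, share_mask)


original_block_key = secrets.token_bytes(32)  # key the block was encrypted with
share_key = secrets.token_bytes(32)           # stand-in for the key exchange output
share_mask = make_share_mask(original_block_key, share_key)
```

The recipient never sees the sharer's private keys, and only the small metadata record has to be written for each shared block.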

How are keys exchanged when sharing through the Underscore Backup Service?

When using the Underscore Backup service, every source uploads a set of public keys for every client. These are stored in the service, and when a share is created they can be downloaded by the sharer. The sharer then encrypts the private key needed to access the share once for each of the public keys provided. The results are then sent to the service. Each client of the recipient can then download the copy encrypted with its own public key from the service and, from that, decrypt the private keys required to access the shares. The private keys are each encrypted as if they were a configuration file in the service (see above for details).