Muokkaa

Jaa


SQL Server backup to URL for S3-compatible object storage

Applies to: SQL Server 2022 (16.x)

This article introduces the concepts, requirements, and components necessary to use S3-compatible object storage as a backup destination. The backup and restore functionality is conceptually similar to working with SQL Server backup to URL for Azure Blob Storage as a backup device type.

For information on supported platforms, see providers of S3-compatible object storage.

Overview

SQL Server 2022 (16.x) introduces object storage integration to the data platform, enabling you to integrate SQL Server with S3-compatible object storage in addition to Azure Storage. To provide this integration, SQL Server supports an S3 connector, which uses the S3 REST API to connect to any provider of S3-compatible object storage. SQL Server 2022 (16.x) extends the existing BACKUP/RESTORE TO/FROM URL syntax by adding support for the new S3 connector using the REST API.

URLs pointing to S3-compatible resources are prefixed with s3:// to denote that the S3 connector is being used. URLs beginning with s3:// always assume that the underlying protocol is https.

Part numbers and file size limitations

To store data, the S3-compatible object storage provider must split files in multiple blocks called parts, similar to block blobs in Azure Blob Storage.

Each file can be split up to 10,000 parts, each part size ranges from 5 MB to 20 MB, this range is controlled by the T-SQL BACKUP command through the parameter MAXTRANSFERSIZE. The default value of MAXTRANSFERSIZE is 10 MB, therefore the default size of each part is 10 MB.

The maximum supported size of a single file is the result of 10,000 parts * MAXTRANSFERSIZE, if it is required to backup a bigger file it must split/striped up to 64 URLs. The final maximum supported size of a file is 10,000 parts * MAXTRANSFERSIZE * URLs.

Note

The use of COMPRESSION is required in order to change MAXTRANSFERSIZE values.

Prerequisites for the S3 endpoint

The S3 endpoint must be configured as follows:

  • TLS must be configured. It is assumed that all connections will be securely transmitted over HTTPS not HTTP. The endpoint is validated by a certificate installed on the SQL Server OS Host.
  • Credentials created on the S3-compatible object storage with proper permissions to perform the operation. The user and password created on the storage layer are named the Access Key ID and Secret Key ID. You need both to authenticate against the S3 endpoint.
  • At least one bucket has been configured. Buckets cannot be created or configured from SQL Server 2022 (16.x).

Security

Backup permissions

To connect SQL Server to S3-compatible object storage, two sets of permissions need to be established, one on SQL Server and also on the storage layer.

On SQL Server the user account that is used to issue BACKUP or RESTORE commands should be in the db_backupoperator database role with Alter any credential permissions.

On the storage layer:

  • In AWS S3, create a custom role and specifically state which S3 API requires access. Back up and restore requires these permissions: ListBucket (Browse), PutObject (Write - for backup).
  • In other S3-compatible storage, the user (Access Key ID) must have both ListBucket and WriteOnly permissions.

Restore permissions

If the database being restored does not exist, the user must have CREATE DATABASE permissions to be able to execute RESTORE. If the database exists, RESTORE permissions default to members of the sysadmin and dbcreator fixed server roles and the owner (dbo) of the database.

RESTORE permissions are given to roles in which membership information is always readily available to the server. Because fixed database role membership can be checked only when the database is accessible and undamaged, which is not always the case when RESTORE is executed, members of the db_owner fixed database role do not have RESTORE permissions.

On the storage layer:

  • In AWS S3, create a custom role and specifically state which S3 API requires access. Back up and restore requires these permissions: ListBucket (Browse), GetObject (Read - for restore).
  • In other S3-compatible storage, the user (Access Key ID) must have both ListBucket and ReadOnly permissions.

Supported features

High-level overview of the supported features for BACKUP and RESTORE:

  1. A single backup file can be up to 200,000 MiB per URL (with MAXTRANSFERSIZE set to 20 MB).
  2. Backups can be striped across a maximum of 64 URLs.
  3. Mirroring is supported, but only across URLs. Mirroring using both URL and DISK is not supported.
  4. Compression is supported and recommended.
  5. Encryption is supported.
  6. Restore from URL with S3-compatible object storage has no size limitation.
  7. When you are restoring a database, the MAXTRANSFERSIZE is determined by value assigned during the backup phase.
  8. URLs can be specified either in virtual host or path style format.
  9. WITH CREDENTIAL is supported.
  10. REGION is supported and the default value is us-east-1.
  11. MAXTRANSFERSIZE ranges from 5 MB to 20 MB. 10 MB is the default value for the S3 connector.

Supported arguments for backup

WITH options S3 Endpoint Notes
BLOCKSIZE Y MAXTRANSFERSIZE determines the Part size.
BUFFERCOUNT Y
COMPRESSION Y
COPY_ONLY Y
CREDENTIAL Y
DESCRIPTION Y
DIFFERENTIAL Y
ENCRYPTION Y
FILE_SNAPSHOT N
MAXTRANSFERSIZE Y From 5 MB (5,242,880 Bytes) to 20 MB (20,971,520 Bytes), default value is 10 MB (10,485,760 Bytes).
MEDIADESCRIPTION Y
MEDIANAME Y
MIRROR TO Y Only works with another URL, MIRROR with URL and DISK is not supported.
NAME Y
NOFORMAT / FORMAT Y
NOINIT / INIT N Appending is not supported. To overwrite a backup, use WITH FORMAT.
NO_CHECKSUM / CHECKSUM Y
NO_TRUNCATE Y
REGION Y Default value is us-east-1. Must be used with BACKUP_OPTIONS.
STATS Y

Supported arguments for restore

WITH options S3 Endpoint Notes
BLOCKSIZE Y MAXTRANSFERSIZE determines the Part size.
BUFFERCOUNT N
CHECKSUM / NO_CHECKSUM Y
CREDENTIAL Y
ENABLE_BROKER / ERROR_BROKER_CONVERSATIONS / NEW_BROKER Y
FILE N Logical names not supported with RESTORE FROM URL.
FILESTREAM Y
KEEP_CDC Y
KEEP_REPLICATION Y
LOADHISTORY Y
MAXTRANSFERSIZE Y
MEDIANAME Y
MEDIAPASSWORD N Required for some backups taken in versions before SQL Server 2012.
MOVE Y
PARTIAL Y
PASSWORD N Required for some backups taken in versions before SQL Server 2012.
RECOVERY / NORECOVERY / STANDBY Y
REGION Y Default value is us-east-1. Must be used with RESTORE_OPTIONS.
REPLACE Y
RESTART Y
RESTRICTED_USER Y
REWIND / NOREWIND N
STATS Y
STOP_ON_ERROR / CONTINUE_AFTER_ERROR Y
STOPAT / STOPATMARK / STOPBEFOREMARK Y
UNLOAD / NOUNLOAD N

Region

Your S3-compatible object storage provider can offer the ability to determine a specific region for the bucket location. The use of this optional parameter can provide more flexibility by specifying which region that particular bucket belongs to. This parameter requires the use of WITH together with either BACKUP_OPTIONS or RESTORE_OPTIONS. These options require the value to be declared in JSON format. This allows scenarios in which an S3-compatible storage provider can have the same universal URL but be distributed across several regions. In this case, the backup or restore command point to the specified regions without the need to change the URL.

If no value is declared, us-east-1 is assigned as default.

Backup example:

WITH BACKUP_OPTIONS = '{"s3": {"region":"us-west-1"}}'

Restore example:

WITH RESTORE_OPTIONS = '{"s3": {"region":"us-west-1"}}'

Linux support

SQL Server uses WinHttp to implement client of HTTP REST APIs it uses. It relies on OS certificate store for validations of the TLS certificates presented by the http(s) endpoint. However, SQL Server on Linux the CA must be placed on a predefined location to be created at /var/opt/mssql/security/ca-certificates, only the first 50 certificates can be stored and supported in this folder. The CA must be in place before SQL Server process is started.

SQL Server reads the certificates from the folder during startup and adds them to the trust store.

Only super user should be able to write in the folder, while the mssql user must be able to read.

Unsupported features

  • Backup to S3-compatible object storage with a nonsecure http URL is not supported. Customers are responsible for setting up their S3 host with an https URL and this endpoint is validated by a certificate installed on the SQL Server OS host.
  • Backup to S3-compatible object storage is not supported in SQL Server Express and SQL Server Express with Advanced Services editions.

Limitations

The following are the current limitations of backup and restore with S3-compatible object storage:

  • Due to the current limitation of S3 Standard REST API, the temporary uncommitted data files that are created in the customer's S3-compatible object store (due to an ongoing multipart upload operation) while the BACKUP T-SQL command is running, are not removed in case of failures. These uncommitted data blocks continue to persist in S3-compatible object storage in the case the BACKUP T-SQL command fails or is canceled. If the backup succeeds, these temporary files are automatically removed by the object store to form the final backup file. Some S3-compatible storage providers handle temporary files through their garbage collector system.
  • The total URL length is limited to 259 characters. The full string is counted in this limitation, including the s3:// connector name. So, the usable limit is 254 characters. However, we recommend sticking to a limit of 200 characters to allow for possible introduction of query parameters.
  • The SQL credential name is limited by 128 characters in UTF-16 format.
  • Secret key ID must not have : character.

Path style and virtual host style

Backup to S3 supports the URL to be written in both path style or virtual host style.

Path style example: s3://<endpoint>:<port>/<bucket>/<backup_file_name>

Virtual host example: s3://<bucket>.<domain>/<backup_file_name>

Examples

Create credential

  • The name of the credential should provide the storage path, and there are multiple standards for this depending on the storage platform.
  • The IDENTITY should always be 'S3 Access Key' when using the S3 connector.
  • The Access Key ID and Secret Key ID must not contain a colon. Access Key ID and Secret Key ID is the user and password created on the S3-compatible object storage.
  • Only alphanumeric values are allowed.
  • The Access Key ID must have proper permissions on the S3-compatible object storage.

Use CREATE CREDENTIAL to create a server level credential for authentication with the S3-compatible object storage endpoint.

USE [master];
CREATE CREDENTIAL [s3://<endpoint>:<port>/<bucket>]
WITH
        IDENTITY    = 'S3 Access Key',
        SECRET      = '<AccessKeyID>:<SecretKeyID>';
GO

BACKUP DATABASE [SQLTestDB]
TO      URL = 's3://<endpoint>:<port>/<bucket>/SQLTestDB.bak'
WITH    FORMAT /* overwrite any existing backup sets */
,       STATS = 10
,       COMPRESSION;

However, AWS S3 supports two different standards of URL.

  • S3://<BUCKET_NAME>.S3.<REGION>.AMAZONAWS.COM/<FOLDER> (default)
  • S3://S3.<REGION>.AMAZONAWS.COM/<BUCKET_NAME>/<FOLDER>

There are multiple approaches to successfully creating a credential for AWS S3.

-- S3 bucket name: datavirtualizationsample
-- S3 bucket region: us-west-2
-- S3 bucket folder: backup

CREATE CREDENTIAL [s3://datavirtualizationsample.s3.us-west-2.amazonaws.com/backup]
WITH    
        IDENTITY    = 'S3 Access Key'
,       SECRET      = 'accesskey:secretkey';
GO

BACKUP DATABASE [AdventureWorks2022]
TO URL  = 's3://datavirtualizationsample.s3.us-west-2.amazonaws.com/backup/AdventureWorks2022.bak'
WITH COMPRESSION, FORMAT, MAXTRANSFERSIZE = 20971520;
GO

Or,

CREATE CREDENTIAL [s3://s3.us-west-2.amazonaws.com/datavirtualizationsample/backup]
WITH    
        IDENTITY    = 'S3 Access Key'
,       SECRET      = 'accesskey:secretkey';
GO

BACKUP DATABASE [AdventureWorks2022]
TO URL  = 's3://s3.us-west-2.amazonaws.com/datavirtualizationsample/backup/AdventureWorks2022.bak'
WITH COMPRESSION, FORMAT, MAXTRANSFERSIZE = 20971520;
GO

Backup to URL

The following example performs a full database backup to the object storage endpoint, striped across multiple files:

BACKUP DATABASE <db_name>
TO      URL = 's3://<endpoint>:<port>/<bucket>/<database>_01.bak'
,       URL = 's3://<endpoint>:<port>/<bucket>/<database>_02.bak'
,       URL = 's3://<endpoint>:<port>/<bucket>/<database>_03.bak'
--
,       URL = 's3://<endpoint>:<port>/<bucket>/<database>_64.bak'
WITH    FORMAT -- overwrite
,       STATS               = 10
,       COMPRESSION;

Restore from URL

The following example performs a database restore from the object storage endpoint location:

RESTORE DATABASE <db_name>
FROM    URL = 's3://<endpoint>:<port>/<bucket>/<database>_01.bak'
,       URL = 's3://<endpoint>:<port>/<bucket>/<database>_02.bak'
,       URL = 's3://<endpoint>:<port>/<bucket>/<database>_03.bak'
--
,       URL = 's3://<endpoint>:<port>/<bucket>/<database>_64.bak'
WITH    REPLACE -- overwrite
,       STATS  = 10;

Options for encryption and compression

The following example shows how to back up and restore the AdventureWorks2022 database with encryption, MAXTRANSFERSIZE as 20 MB and compression:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = <password>;
GO

CREATE CERTIFICATE AdventureWorks2022Cert
    WITH SUBJECT = 'AdventureWorks2022 Backup Certificate';
GO
-- Backup database
BACKUP DATABASE AdventureWorks2022
TO URL = 's3://<endpoint>:<port>/<bucket>/AdventureWorks2022_Encrypt.bak'
WITH FORMAT, MAXTRANSFERSIZE = 20971520, COMPRESSION,
ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = AdventureWorks2022Cert)
GO

-- Restore database
RESTORE DATABASE AdventureWorks2022
FROM URL = 's3://<endpoint>:<port>/<bucket>/AdventureWorks2022_Encrypt.bak'
WITH REPLACE

Use region for backup and restore

The following example shows how to back up and restore the AdventureWorks2022 database using REGION_OPTIONS:

You can parameterize the region within each BACKUP / RESTORE command. Note the S3-specific region string in the BACKUP_OPTIONS and RESTORE_OPTIONS, for example, '{"s3": {"region":"us-west-2"}}'. The default region is us-east-1. A simple example:

-- Backup Database
BACKUP DATABASE AdventureWorks2022
TO URL = 's3://<endpoint>:<port>/<bucket>/AdventureWorks2022.bak'
WITH BACKUP_OPTIONS = '{"s3": {"region":"us-west-2"}}'

-- Restore Database
RESTORE DATABASE AdventureWorks2022
FROM URL = 's3://<endpoint>:<port>/<bucket>/AdventureWorks2022.bak'
WITH 
  MOVE 'AdventureWorks2022' 
  TO 'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\AdventureWorks2022.mdf'
, MOVE 'AdventureWorks2022_log' 
  TO 'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\AdventureWorks2022.ldf'
, RESTORE_OPTIONS = '{"s3": {"region":"us-west-2"}}'

For example:

-- S3 bucket name: datavirtualizationsample
-- S3 bucket region: us-west-2
-- S3 bucket folder: backup

CREATE CREDENTIAL   [s3://datavirtualizationsample.s3.amazonaws.com/backup]
WITH    
        IDENTITY    = 'S3 Access Key'
,       SECRET      = 'accesskey:secretkey';
GO

BACKUP DATABASE [AdventureWorks2022]
TO URL  = 's3://datavirtualizationsample.s3.amazonaws.com/backup/AdventureWorks2022.bak'
WITH
    BACKUP_OPTIONS = '{"s3": {"region":"us-west-2"}}' -- REGION AS PARAMETER)
, COMPRESSION, FORMAT, MAXTRANSFERSIZE = 20971520;
GO

RESTORE DATABASE AdventureWorks2022_1 
FROM URL = 's3://datavirtualizationsample.s3.amazonaws.com/backup/AdventureWorks2022.bak'
WITH 
  MOVE 'AdventureWorks2022' 
  TO 'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\AdventureWorks2022_1.mdf'
, MOVE 'AdventureWorks2022_log' 
  TO 'C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA\AdventureWorks2022_1.ldf'
, STATS = 10, RECOVERY
, REPLACE, RESTORE_OPTIONS = '{"s3": {"region":"us-west-2"}}'; -- REGION AS PARAMETER)
GO