SwiftStack Data Protection Suite - Advanced Configuration

Overview

This is an advanced configuration guide for the SwiftStack Data Protection middleware suite. SwiftStack Data Protection is designed to automatically guard against data loss due to accidental or malicious requests to delete or overwrite data.

For a basic configuration overview, please see SwiftStack Data Protection.

Protection is provided by retaining copies of deleted or overwritten objects in limited-access containers. To prevent unbounded cluster growth, these archive copies are kept for some configurable retention period.

Caveats

  • Data protection imposes non-trivial overhead on all affected PUT and DELETE requests, as data must be moved within the cluster before the client request may be serviced.
  • Retention periods require careful consideration. If the retention period is too short, objects in archive containers may expire before administrators notice that they shouldn't have been deleted. If too long, the additional storage requirements imposed by archive containers may necessitate earlier-than-expected cluster expansion.
  • Even with an appropriate retention period, write-heavy primary containers may overwhelm the archive container.
  • Data protection limits users' ability to manage their own storage consumption. Avoid using Account Quotas and data protection on the same account.
  • You should thoroughly test new applications when working with data protection to ensure their behavior is expected. If issues are identified, you may wish to disable data protection for the account(s) used.
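
To make the retention trade-off above concrete, the following rough sketch estimates the steady-state size of the archived data. It assumes a constant churn rate (bytes overwritten or deleted per day) and ignores replication or erasure-coding overhead; the function name and the workload figures are illustrative, not measurements.

```python
def estimate_archive_bytes(churn_bytes_per_day, retention_days):
    """Rough steady-state size of archived data, before replication.

    Assumes every overwritten or deleted byte is archived once and
    expires exactly retention_days later.
    """
    return churn_bytes_per_day * retention_days


# Hypothetical workload: 50 GiB overwritten or deleted per day
churn = 50 * 2**30
for days in (7, 30, 90):
    extra = estimate_archive_bytes(churn, days)
    print('%3d days -> %.1f TiB of archived data' % (days, extra / 2.0**40))
```

Comparing a few candidate retention periods this way may help decide whether the extra capacity is acceptable before enabling protection on a write-heavy account.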

Note

In light of the above limitations and considerations, this feature is not available by default; please contact SwiftStack support if your use case requires this behavior.

Enabling

Three middlewares must be enabled for the cluster:

  • Data Protection
  • Defaulter
  • Versioned Writes

After enabling all three, push config to complete the change. All data in newly-created containers will be protected by default.

Pre-Existing Containers

Pre-existing containers will not be automatically protected; this is due to the difficulty in distinguishing between a pre-existing container and a container that was intentionally excluded from protection.

For a single container

With a super user auth token, enable history-based versioning on the container:

curl https://swift.host/v1/AUTH_user/$CONT -X POST \
  -H 'X-Auth-Token: AUTH_tk....' \
  -H "X-History-Location: .versions-$CONT"

The archive container will be automatically created as needed. If using a different auto_enable_prefix, the X-History-Location header should reflect that.
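
The same request can also be issued from Python. The following is a minimal sketch, assuming python-swiftclient and a Connection authenticated as a super user; the helper names are hypothetical, and the default prefix of .versions- is simply taken from the curl example above.

```python
def history_location(container, prefix='.versions-'):
    """Name of the archive container for `container`.

    The default prefix mirrors the curl example; pass the cluster's
    auto_enable_prefix if it differs.
    """
    return prefix + container


def enable_history(conn, container, prefix='.versions-'):
    """POST X-History-Location to a single container.

    `conn` is expected to behave like a swiftclient.Connection that is
    authenticated as a super user for the target account.
    """
    location = history_location(container, prefix)
    conn.post_container(container, {'X-History-Location': location})
    return location
```

Keeping the name derivation in its own function makes it easy to confirm the header value before any request is sent.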

For an entire account

Python simplifies the process of listing containers, skipping archive containers, and checking whether versioning is already set up:

try:
    import urlparse
except ImportError:
    from urllib import parse as urlparse
import swiftclient

auth_url = 'https://swift.host/auth/v1.0'
admin_user = 'super_user'
admin_key = '...'
account = 'AUTH_user'

conn = swiftclient.Connection(auth_url, admin_user, admin_key)
url, token = conn.get_auth()
info = conn.get_capabilities()
if 'defaulter' not in info:
    exit('defaulter not enabled')
container_format = info['defaulter'].get(
    'default-container-x-history-location')
if not container_format:
    exit('defaulter missing default-container-x-history-location')
if not container_format.endswith('{container}'):
    exit('defaulter specified a non-prefix '
         'default-container-x-history-location')
container_prefix = container_format[:-len('{container}')]

# Swap the last path segment (the admin's own account) for the target account
url = urlparse.urljoin(url.rstrip('/'), account)
conn = swiftclient.Connection(auth_url, admin_user, admin_key,
                              preauthurl=url, preauthtoken=token)
for container in conn.get_account(full_listing=True)[1]:
    if container['name'].startswith(container_prefix):
        continue
    meta = conn.head_container(container['name'])
    if 'x-versions-location' in meta or 'x-history-location' in meta:
        continue
    history_location = container_prefix + container['name']
    conn.post_container(container['name'],
                        {'X-History-Location': history_location})

For all accounts

The SwiftStack Controller utilization API may be used to identify all accounts within a cluster:

try:
    import urlparse
except ImportError:
    from urllib import parse as urlparse
import time
import requests
import swiftclient.service

controller_url = 'https://platform.swiftstack.com'
controller_user = 'swiftstack_controller_user'
controller_api_key = '...'
cluster_id = ...

auth_url = 'https://swift.host/auth/v1.0'
admin_user = 'super_user'
admin_key = '...'

def iter_accounts():
    utilization_path = ('/api/v1/clusters/%s/utilization/storage/0'
                        % cluster_id)
    start = time.strftime('%Y-%m-%dT%H:%M:%SZ',
                          time.gmtime(time.time() - 24 * 60 * 60))
    offset = 0
    while True:
        result = requests.get(
            urlparse.urljoin(controller_url, utilization_path),
            headers={
                'Authorization': 'apikey %s:%s' % (controller_user,
                                                   controller_api_key)},
            params={'start': start, 'offset': offset}).json()
        if not result['objects']:
            break
        for record in result['objects']:
            yield record['account']
        offset += len(result['objects'])

conn = swiftclient.Connection(auth_url, admin_user, admin_key)
url, token = conn.get_auth()
info = conn.get_capabilities()
if 'defaulter' not in info:
    exit('defaulter not enabled')
container_format = info['defaulter'].get(
    'default-container-x-history-location')
if not container_format:
    exit('defaulter missing default-container-x-history-location')
if not container_format.endswith('{container}'):
    exit('defaulter specified a non-prefix '
         'default-container-x-history-location')
container_prefix = container_format[:-len('{container}')]

for account in iter_accounts():
    # Swap the last path segment of the storage URL for this account
    url = urlparse.urljoin(url.rstrip('/'), account)
    conn = swiftclient.Connection(auth_url, admin_user, admin_key,
                                  preauthurl=url, preauthtoken=token)
    for container in conn.get_account(full_listing=True)[1]:
        if container['name'].startswith(container_prefix):
            continue
        meta = conn.head_container(container['name'])
        if ('x-versions-location' in meta or
                'x-history-location' in meta):
            continue
        history_location = container_prefix + container['name']
        conn.post_container(container['name'],
                            {'X-History-Location': history_location})

Important

You will probably want to run this once immediately after enabling data protection, then again a few days later. This ensures protection for:

  • Recently-created accounts that had not reported utilization before the initial run
  • Recently-created containers that were not available in listings during the initial run
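
Because the script is meant to be run more than once, it can help to have it report what each run changed. The sketch below wraps the per-account loop in a function; protect_account and its dry_run flag are hypothetical additions, and conn is assumed to offer the get_account, head_container, and post_container methods of swiftclient.Connection.

```python
def protect_account(conn, container_prefix, dry_run=False):
    """Enable history-based versioning wherever it is missing.

    Returns the names of the containers that were (or, with dry_run,
    would have been) updated, so that repeat runs can be compared.
    """
    updated = []
    for container in conn.get_account(full_listing=True)[1]:
        name = container['name']
        if name.startswith(container_prefix):
            continue  # archive container; never version these
        meta = conn.head_container(name)
        if 'x-versions-location' in meta or 'x-history-location' in meta:
            continue  # versioning is already configured
        if not dry_run:
            conn.post_container(
                name, {'X-History-Location': container_prefix + name})
        updated.append(name)
    return updated
```

An empty result on the follow-up run suggests that no unprotected containers were missed the first time.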

Retrieving Old Versions

Old versions of an object <object name> will appear in the archive container as objects named <length><object name>/<timestamp>, where <length> is the zero-padded, three-digit hexadecimal representation of the length of <object name> and <timestamp> is the Unix timestamp of that object's Last-Modified date when it was in the protected container.

To list old versions of an $OBJECT, do a prefix-listing of the $ARCHIVE_CONTAINER:

curl -H "X-Auth-Token: $OS_AUTH_TOKEN" \
  "$OS_STORAGE_URL/$ARCHIVE_CONTAINER?prefix=$(printf \
  '%03x%s/' ${#OBJECT} $OBJECT)"

Or with Python:

import swiftclient

cont, obj = ...
conn = swiftclient.Connection(auth_url, user, key)
cont_meta = conn.head_container(cont)
# get_container() returns (headers, listing); keep only the listing
_, old_versions = conn.get_container(cont_meta['x-history-location'],
                                     prefix='%03x%s/' % (len(obj), obj))

See Swift's Object Versioning documentation for more information.
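
The archive naming scheme described above can be built and taken apart with a few lines of Python. This is a sketch with hypothetical helper names; it assumes ASCII object names (for other names the length may need to be computed over the encoded bytes) and a timestamp that parses as a decimal number.

```python
def archive_prefix(obj):
    """Listing prefix that matches every archived version of `obj`."""
    return '%03x%s/' % (len(obj), obj)


def parse_archive_name(name):
    """Split '<length><object name>/<timestamp>' into its parts.

    Returns (object_name, timestamp), with the timestamp as a float.
    """
    length = int(name[:3], 16)
    end = 3 + length
    if name[end:end + 1] != '/':
        raise ValueError('not an archive object name: %r' % name)
    return name[3:end], float(name[end + 1:])


# Illustrative object name and timestamp
name = archive_prefix('docs/report.txt') + '1500000000.12345'
print(name)
print(parse_archive_name(name))
```

Reading the length first, rather than splitting on the last slash alone, keeps the parse unambiguous for object names that themselves contain slashes.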

Configuration

Data protection uses a hierarchical configuration, allowing for fine-grained control over which accounts and containers are protected.

Cluster-Wide

The Data Protection middleware has three configuration options beyond simple enablement:

auto_enable_prefix
The prefix to be used for archive containers. When a new container is created, it will automatically have versioning enabled with an X-History-Location: <prefix><container name> header. Ordinary users will not be able to create containers starting with this prefix.

owner_can_protect
Whether account owners can manage container protections. By default, only super users may:

  • create or delete containers beginning with the auto_enable_prefix;
  • delete, overwrite, or POST to objects in such a container; or
  • toggle versioning on any containers.

default_versions_retention
How long old versions of objects should be kept once moved to archive containers. When an archive container is first created, this will be used to set the X-Default-Object-X-Delete-After header for that container. A value of 0 will cause objects to be retained indefinitely.

Per-Account

To disable protection for an entire account, administrators may change the X-Undelete-Enabled setting for the account:

curl https://swift.host/v1/AUTH_user -X POST \
  -H 'X-Auth-Token: AUTH_tk....' -H 'X-Undelete-Enabled: false'

Note

There is currently no way to change the default retention period for an entire account.

Setting X-Undelete-Enabled to false will immediately:

  • lift the added restrictions on ordinary users' ability to set X-Versions-Location and X-History-Location headers, and
  • prevent X-History-Location from being automatically set on new containers.

Since users may still want to use X-History-Mode, however, pre-existing containers will still respect the currently-set X-History-Mode. As with enabling protection for an entire account, you can loop over all existing containers if you really want to disable all versioning within the account:

import swiftclient

auth_url = 'https://swift.host/auth/v1.0'
user = 'user'
key = '...'

conn = swiftclient.Connection(auth_url, user, key)
for container in conn.get_account(full_listing=True)[1]:
    meta = conn.head_container(container['name'])
    if 'x-versions-location' in meta or 'x-history-location' in meta:
        conn.post_container(container['name'],
                            {'X-Remove-History-Location': 'x',
                             'X-Remove-Versions-Location': 'x'})

This may be done as an ordinary user.

Per-Container

To disable data protection for a specific container, an administrator may remove the X-History-Location setting for the container:

curl https://swift.host/v1/AUTH_user/container -X POST \
  -H 'X-Auth-Token: AUTH_tk....' -H 'X-Remove-History-Location: true'

Any previously-protected archive container will continue to be protected, although the data in it will still expire according to the container's retention period.

To adjust the retention period for an archive container, an administrator should change the X-Default-Object-X-Delete-After setting for the archive container:

curl https://swift.host/v1/AUTH_user/.versions-container -X POST \
  -H 'X-Auth-Token: AUTH_tk....' \
  -H 'X-Default-Object-X-Delete-After: <new retention period, in seconds>'

Note

This affects newly-archived objects; existing versions will continue to expire according to their X-Delete-At timestamp.

Note

Unlike the cluster-wide setting, a retention period of 0 will immediately expire new versions. Use X-Remove-Default-Object-X-Delete-After: true to retain old versions indefinitely.
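
Since 0 and "indefinitely" mean different things at the container level, a small helper can guard against mixing them up. The following sketch uses a hypothetical function name and takes the retention in days purely for convenience; the header names are the ones shown above.

```python
def retention_headers(retention_days):
    """Headers to POST to an archive container.

    None means "retain indefinitely" and removes the container default;
    a positive number of days is converted to seconds.
    """
    if retention_days is None:
        return {'X-Remove-Default-Object-X-Delete-After': 'true'}
    if retention_days <= 0:
        # At the container level, 0 would expire new versions immediately
        raise ValueError('retention must be positive, or None to retain '
                         'old versions indefinitely')
    seconds = int(retention_days * 24 * 60 * 60)
    return {'X-Default-Object-X-Delete-After': str(seconds)}
```

With python-swiftclient, the result could be passed directly as the headers argument of Connection.post_container for the archive container.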

Per-Object

To adjust the retention period for an individual archived object, administrators may change or remove the X-Delete-At header for that object with a POST request. See the Swift documentation for expiring objects for more details.

Note

The POST needs to include any object metadata that should be preserved. Alternatively, you may wish to COPY the object to itself while overriding the X-Delete-At/X-Delete-After headers.
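
One way to carry existing metadata over is to build the POST headers from the result of a HEAD request. This is a minimal sketch with a hypothetical helper name; it copies only user metadata (X-Object-Meta-* headers), so other headers that should survive the POST may need the same treatment.

```python
def post_headers_with_expiry(current_headers, delete_at):
    """Build POST headers that keep user metadata and set a new expiry.

    current_headers is the dict returned by a HEAD on the object;
    delete_at is the new expiry as a Unix timestamp.
    """
    headers = {key: value for key, value in current_headers.items()
               if key.lower().startswith('x-object-meta-')}
    headers['X-Delete-At'] = str(int(delete_at))
    return headers
```

With python-swiftclient, current_headers could come from Connection.head_object, and the result could be passed to Connection.post_object for the same container and object.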

Middleware Details

The data protection suite is composed of three related middlewares:

  • A new defaulter middleware, allowing users and administrators to specify headers that should be applied (if not already set by the client) during PUT requests,
  • A fork of versioned_writes which includes several in-review patches to:
    • add support for a history-based versioning mode (available upstream since Swift 2.10.0),
    • automatically create archive containers on demand if they don't already exist, and
    • respect defaulter settings when copying data to the archive container, to set X-Delete-After headers according to the retention period.
  • A new data-protection middleware, which:
    • limits write access to "protected" containers,
    • authorizes the automatic creation of archive containers even if the requesting user wouldn't have been allowed to create the container, and
    • restricts ordinary users' abilities to set certain headers, such as
      • X-Versions-Location
      • X-History-Location
      • X-Default-Container-X-Versions-Location
      • X-Default-Container-X-History-Location
      • X-Default-Object-X-Delete-At
      • X-Default-Object-X-Delete-After
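
The defaulter behavior described in the first item above (apply a header only if the client did not already set it) can be illustrated as follows. This is not the middleware's actual code, only a sketch of the rule under the assumption that header names are compared case-insensitively; the function name is hypothetical.

```python
def apply_defaults(request_headers, defaults):
    """Return a copy of the request headers with defaults filled in.

    A default is applied only when the client did not send that header.
    """
    sent = {name.lower() for name in request_headers}
    merged = dict(request_headers)
    for name, value in defaults.items():
        if name.lower() not in sent:
            merged[name] = value
    return merged


# Illustrative values: the client's explicit header wins over the default
print(apply_defaults({'x-delete-after': '60'},
                     {'X-Delete-After': '2592000'}))
```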