SwiftStack Data Protection Suite - Advanced Configuration¶
Overview¶
This is an advanced configuration guide for the SwiftStack Data Protection middleware suite. SwiftStack Data Protection is designed to automatically guard against data loss due to accidental or malicious requests to delete or overwrite data.
For a basic configuration overview, please see SwiftStack Data Protection.
Protection is provided by retaining copies of deleted or overwritten objects in limited-access containers. To prevent unbounded cluster growth, these archive copies are kept for some configurable retention period.
Caveats¶
- This imposes non-trivial overhead on all affected PUT and DELETE requests, as data must be moved within the cluster before the client request may be serviced.
- Retention periods require careful consideration. If the retention period is too short, objects in archive containers may expire before administrators notice that they shouldn't have been deleted. If too long, the additional storage requirements imposed by archive containers may necessitate earlier-than-expected cluster expansion.
- Even with an appropriate retention period, write-heavy primary containers may overwhelm the archive container.
- This limits users' abilities to manage their own storage consumption. Avoid using Account Quotas and data protection on the same account.
- You should thoroughly test new applications when working with data protection to ensure their behavior is expected. If issues are identified, you may wish to disable data protection for the account(s) used.
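As a rough aid for the retention-period trade-off above, the steady-state size of the archive containers can be approximated as daily churn multiplied by the retention period. The sketch below is only a back-of-the-envelope model: `estimate_archive_bytes` is a hypothetical helper, the workload figures are invented, and replication or erasure-coding overhead is not included.

```python
def estimate_archive_bytes(daily_churn_bytes, retention_days):
    """Approximate steady-state archive size, in logical bytes.

    Every overwritten or deleted byte is kept for `retention_days`, so once
    the first retention period has elapsed the archive containers hold about
    `retention_days` worth of churn.
    """
    if daily_churn_bytes < 0 or retention_days < 0:
        raise ValueError('inputs must be non-negative')
    return daily_churn_bytes * retention_days


GIB = 2 ** 30
# Hypothetical workload: 50 GiB overwritten or deleted per day,
# kept for a 30-day retention period.
print(estimate_archive_bytes(50 * GIB, 30) / GIB)  # -> 1500.0 (GiB)
```

Halving the retention period halves this figure, which is why the period should be sized against how quickly unwanted deletions are typically noticed.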
Note
In light of the above limitations and considerations, this feature is not available by default; please contact SwiftStack support if your use case requires this behavior.
Enabling¶
Three middlewares must be enabled for the cluster:
- Data Protection
- Defaulter
- Versioned Writes
After enabling all three, push config to complete the change. All data in newly-created containers will be protected by default.
Pre-Existing Containers¶
Pre-existing containers will not be automatically protected; this is due to the difficulty of distinguishing between a pre-existing container and a container that was intentionally excluded from protection.
- For a single container
With a super user auth token, enable history-based versioning on the container:
curl https://swift.host/v1/AUTH_user/$CONT -X POST \
    -H 'X-Auth-Token: AUTH_tk....' \
    -H "X-History-Location: .versions-$CONT"
The archive container will be automatically created as needed. If using a different auto_enable_prefix, the X-History-Location header should reflect that.
- For an entire account
Python simplifies the process of listing containers, skipping archive containers, and checking whether versioning is already set up:
try:
    import urlparse
except ImportError:
    from urllib import parse as urlparse
import swiftclient

auth_url = 'https://swift.host/auth/v1.0'
admin_user = 'super_user'
admin_key = '...'
account = 'AUTH_user'

# Authenticate as the super user and read the defaulter settings.
conn = swiftclient.Connection(auth_url, admin_user, admin_key)
url, token = conn.get_auth()
info = conn.get_capabilities()
if 'defaulter' not in info:
    exit('defaulter not enabled')
container_format = info['defaulter'].get(
    'default-container-x-history-location')
if not container_format:
    exit('defaulter missing default-container-x-history-location')
if not container_format.endswith('{container}'):
    exit('defaulter specified a non-prefix '
         'default-container-x-history-location')
container_prefix = container_format[:-len('{container}')]

# Re-point the connection at the target account, reusing the admin token.
# Stripping any trailing slash makes urljoin replace the account segment.
url = urlparse.urljoin(url.rstrip('/'), account)
conn = swiftclient.Connection(auth_url, admin_user, admin_key, 5, url, token)
for container in conn.get_account(full_listing=True)[1]:
    # Skip archive containers.
    if container['name'].startswith(container_prefix):
        continue
    # Skip containers that already have versioning set up.
    meta = conn.head_container(container['name'])
    if 'x-versions-location' in meta or 'x-history-location' in meta:
        continue
    history_location = container_prefix + container['name']
    conn.post_container(container['name'],
                        {'X-History-Location': history_location})
- For all accounts
The SwiftStack Controller utilization API may be used to identify all accounts within a cluster:
try:
    import urlparse
except ImportError:
    from urllib import parse as urlparse
import time

import requests
import swiftclient.service

controller_url = 'https://platform.swiftstack.com'
controller_user = 'swiftstack_controller_user'
controller_api_key = '...'
cluster_id = ...
auth_url = 'https://swift.host/auth/v1.0'
admin_user = 'super_user'
admin_key = '...'


def iter_accounts():
    # Page through the last 24 hours of utilization records.
    utilization_path = ('/api/v1/clusters/%s/utilization/storage/0'
                        % cluster_id)
    start = time.strftime('%Y-%m-%dT%H:%M:%SZ',
                          time.gmtime(time.time() - 24 * 60 * 60))
    offset = 0
    while True:
        result = requests.get(
            urlparse.urljoin(controller_url, utilization_path),
            headers={
                'Authorization': 'apikey %s:%s' % (controller_user,
                                                   controller_api_key)},
            params={'start': start, 'offset': offset}).json()
        if not result['objects']:
            break
        for record in result['objects']:
            yield record['account']
        offset += len(result['objects'])


# Authenticate as the super user and read the defaulter settings.
conn = swiftclient.Connection(auth_url, admin_user, admin_key)
url, token = conn.get_auth()
info = conn.get_capabilities()
if 'defaulter' not in info:
    exit('defaulter not enabled')
container_format = info['defaulter'].get(
    'default-container-x-history-location')
if not container_format:
    exit('defaulter missing default-container-x-history-location')
if not container_format.endswith('{container}'):
    exit('defaulter specified a non-prefix '
         'default-container-x-history-location')
container_prefix = container_format[:-len('{container}')]

for account in iter_accounts():
    # Re-point the connection at each account, reusing the admin token.
    # Stripping any trailing slash makes urljoin replace the account segment.
    url = urlparse.urljoin(url.rstrip('/'), account)
    conn = swiftclient.Connection(auth_url, admin_user, admin_key, 5,
                                  url, token)
    for container in conn.get_account(full_listing=True)[1]:
        # Skip archive containers.
        if container['name'].startswith(container_prefix):
            continue
        # Skip containers that already have versioning set up.
        meta = conn.head_container(container['name'])
        if ('x-versions-location' in meta or
                'x-history-location' in meta):
            continue
        history_location = container_prefix + container['name']
        conn.post_container(container['name'],
                            {'X-History-Location': history_location})
Important
You will probably want to run this once immediately after enabling data protection, then again a few days later. This ensures protection for:
- Recently-created accounts that had not reported utilization before the initial run
- Recently-created containers that were not available in listings during the initial run
Retrieving Old Versions¶
Old versions of an object <object name> will appear in the archive container under the name <length><object name>/<timestamp>, where <length> is the three-digit, zero-padded hexadecimal representation of the length of <object name> and <timestamp> is the Unix timestamp of that object's Last-Modified date when it was in the protected container.
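The naming scheme above can be sketched in Python as follows. `archive_prefix` and `parse_archive_name` are hypothetical helpers written for illustration, and the sketch assumes ASCII object names so that the character count and byte count agree.

```python
def archive_prefix(obj):
    """Return the prefix shared by every archived version of `obj`."""
    # Three-digit, zero-padded hexadecimal length, then the name and a slash.
    return '%03x%s/' % (len(obj), obj)


def parse_archive_name(name):
    """Split an archive object name into (object name, timestamp string)."""
    length = int(name[:3], 16)
    obj = name[3:3 + length]
    separator = name[3 + length:4 + length]
    timestamp = name[4 + length:]
    if separator != '/' or not timestamp:
        raise ValueError('not an archive object name: %r' % name)
    return obj, timestamp


print(archive_prefix('photo.jpg'))
print(parse_archive_name('003a/b/1500000000.00000'))
```

Because the length is encoded up front, an object name that itself contains slashes (such as `a/b`) can still be separated unambiguously from its timestamp.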
To list old versions of an $OBJECT, do a prefix-listing of the $ARCHIVE_CONTAINER:
curl -H "X-Auth-Token: $OS_AUTH_TOKEN" \
"$OS_STORAGE_URL/$ARCHIVE_CONTAINER?prefix=$(printf \
'%03x%s/' ${#OBJECT} $OBJECT)"
Or with Python:
import swiftclient

cont, obj = ...
conn = swiftclient.Connection(auth_url, user, key)
cont_meta = conn.head_container(cont)
# get_container returns a (response headers, object listing) tuple.
_, old_versions = conn.get_container(
    cont_meta['x-history-location'],
    prefix='%03x%s/' % (len(obj), obj))
See Swift's Object Versioning documentation for more information.
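Once a prefix listing has been retrieved, picking out the most recent archived version is a matter of comparing timestamps. The following is a minimal sketch: `newest_version` is a hypothetical helper, the sample listing is invented, and each entry is assumed to be a dict with a `name` key, as returned in the listing half of `get_container`.

```python
def newest_version(listing):
    """Return the listing entry with the largest timestamp, or None."""
    def timestamp(entry):
        # Everything after the last '/' in the archive name is the timestamp.
        return float(entry['name'].rsplit('/', 1)[1])

    return max(listing, key=timestamp) if listing else None


# Hypothetical listing for an object named "a.txt" (length 5 -> "005").
sample = [
    {'name': '005a.txt/1500000000.00000'},
    {'name': '005a.txt/1500003600.00000'},
]
print(newest_version(sample)['name'])
```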
Configuration¶
Data protection uses a hierarchical configuration, allowing for fine-grained control over which accounts and containers are protected.
Cluster-Wide¶
The Data Protection middleware has three configuration options beyond simple enablement:
auto_enable_prefix
- The prefix to be used for archive containers. When a new container named container is created, it will automatically have versioning enabled with an X-History-Location: <prefix>container header. Ordinary users will not be able to create containers starting with this prefix.
owner_can_protect
- Whether account owners can manage container protections. By default, only super users may:
  - create or delete containers beginning with the auto_enable_prefix;
  - delete, overwrite, or POST to objects in such a container; or
  - toggle versioning on any containers.
default_versions_retention
- How long old versions of objects should be kept once moved to archive containers. When an archive container is first created, this will be used to set the X-Default-Object-X-Delete-After header for that container. A value of 0 will cause objects to be retained indefinitely.
Per-Account¶
To disable protection for an entire account, administrators may change the X-Undelete-Enabled setting for the account:
curl https://swift.host/v1/AUTH_user -X POST \
-H 'X-Auth-Token: AUTH_tk....' -H 'X-Undelete-Enabled: false'
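The same request can be sent from Python. This is a minimal sketch assuming python-swiftclient; `disable_account_protection` is a hypothetical wrapper, and the connection is assumed to have been built with a super user token and a storage URL pointing at the target account, as in the scripts for pre-existing containers.

```python
def disable_account_protection(conn):
    """POST X-Undelete-Enabled: false to the account behind `conn`.

    `conn` is expected to be a swiftclient.Connection (or anything else
    exposing a compatible post_account method).
    """
    conn.post_account({'X-Undelete-Enabled': 'false'})


# Usage (not executed here; requires a reachable cluster):
#   import swiftclient
#   conn = swiftclient.Connection(
#       preauthurl='https://swift.host/v1/AUTH_user',
#       preauthtoken='AUTH_tk....')
#   disable_account_protection(conn)
```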
Note
There is currently no way to change the default retention period for an entire account.
Disabling protection for the account will immediately:
- lift the added restrictions on ordinary users' abilities to set X-Versions-Location and X-History-Location headers, and
- prevent X-History-Location from being automatically set on new containers.
Since users may still want to use X-History-Mode, however, pre-existing containers will still respect the currently-set X-History-Mode. Similar to enabling protection for an entire account, you can loop over all existing containers if you really want to disable all versioning within the account:
import swiftclient

auth_url = 'https://swift.host/auth/v1.0'
user = 'user'
key = '...'

conn = swiftclient.Connection(auth_url, user, key)
for container in conn.get_account(full_listing=True)[1]:
    meta = conn.head_container(container['name'])
    if 'x-versions-location' in meta or 'x-history-location' in meta:
        conn.post_container(container['name'],
                            {'X-Remove-History-Location': 'x',
                             'X-Remove-Versions-Location': 'x'})
This may be done as an ordinary user.
Per-Container¶
To disable data protection for a specific container, an administrator may remove the X-History-Location setting for the container:
curl https://swift.host/v1/AUTH_user/container -X POST \
-H 'X-Auth-Token: AUTH_tk....' -H 'X-Remove-History-Location: true'
Any previously-protected archive container will continue to be protected, although the data in it will still expire according to the container's retention period.
To adjust the retention period for an archive container, an administrator should change the X-Default-Object-X-Delete-After setting for the archive container:
curl https://swift.host/v1/AUTH_user/.versions-container -X POST \
-H 'X-Auth-Token: AUTH_tk....' \
-H 'X-Default-Object-X-Delete-After: <new retention period, in seconds>'
Note
This affects newly-archived objects; existing versions will continue to expire according to their X-Delete-At timestamp.
Note
Unlike the cluster-wide setting, a retention period of 0 will immediately expire new versions. Use X-Remove-Default-Object-X-Delete-After: true to retain old versions indefinitely.
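Because 0 means different things at the cluster and container levels, it can help to build the per-container headers through a small helper that refuses ambiguous values. The sketch below is illustrative only: `retention_headers` is a hypothetical function, and passing `None` is this sketch's own convention for "retain indefinitely".

```python
def retention_headers(seconds=None):
    """Build POST headers for an archive container's retention period.

    seconds=None removes the default, so old versions are kept indefinitely.
    Zero and negative values are rejected, since a per-container value of 0
    would expire newly archived versions immediately.
    """
    if seconds is None:
        return {'X-Remove-Default-Object-X-Delete-After': 'true'}
    if seconds <= 0:
        raise ValueError('retention must be positive; use None to retain '
                         'old versions indefinitely')
    return {'X-Default-Object-X-Delete-After': str(int(seconds))}


print(retention_headers(30 * 24 * 60 * 60))  # 30 days
print(retention_headers(None))
```

With python-swiftclient, the result could be passed to `conn.post_container('.versions-container', retention_headers(...))` using a super user connection.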
Per-Object¶
To adjust the retention period for an individual archived object, administrators may change or remove the X-Delete-At header for that object with a POST request. See the Swift documentation for expiring objects for more details.
Note
The POST needs to include any object metadata that should be preserved. Alternatively, you may wish to COPY the object to itself while overriding the X-Delete-At/X-Delete-After headers.
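One way to preserve metadata across such a POST is to read the object's current headers first and resend the user metadata alongside the new expiry. This is a sketch under stated assumptions: `delete_at_headers` is a hypothetical helper, and it carries over only X-Object-Meta-* headers; any other headers that should survive would need to be added by hand.

```python
def delete_at_headers(current_headers, delete_at):
    """Build POST headers that keep user metadata and set a new X-Delete-At.

    `current_headers` is the dict returned by a HEAD on the object;
    `delete_at` is the new expiry as a Unix timestamp.
    """
    headers = {
        name: value for name, value in current_headers.items()
        if name.lower().startswith('x-object-meta-')
    }
    headers['X-Delete-At'] = str(int(delete_at))
    return headers


# Hypothetical HEAD response for an archived object.
current = {
    'content-length': '1024',
    'x-object-meta-owner': 'alice',
    'x-delete-at': '1500000000',
}
print(delete_at_headers(current, 1600000000))
```

With python-swiftclient, `current` would come from `conn.head_object(container, name)` and the result would be sent with `conn.post_object(container, name, headers)`.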
Middleware Details¶
The data protection suite is composed of three related middlewares:
- A new defaulter middleware, allowing users and administrators to specify headers that should be applied (if not already set by the client) during PUT requests,
- A fork of versioned_writes which includes several in-review patches to:
  - add support for a history-based versioning mode (available upstream since Swift 2.10.0),
  - automatically create archive containers on demand if they don't already exist, and
  - respect defaulter settings when copying data to the archive container, to set X-Delete-After headers according to the retention period.
- A new data-protection middleware, which:
  - limits write access to "protected" containers,
  - authorizes the automatic creation of archive containers even if the requesting user wouldn't have been allowed to create the container, and
  - restricts ordinary users' abilities to set certain headers, such as
    - X-Versions-Location
    - X-History-Location
    - X-Default-Container-X-Versions-Location
    - X-Default-Container-X-History-Location
    - X-Default-Object-X-Delete-At
    - X-Default-Object-X-Delete-After
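The defaulter behaviour described above (apply a header only if the client has not already set it) can be pictured with a small conceptual model. This is not the middleware's source: `apply_defaults` is a hypothetical function, and the header names and values in the example are chosen purely for illustration.

```python
def apply_defaults(request_headers, defaults):
    """Return request headers with any missing defaults filled in.

    Header names are compared case-insensitively, and values supplied by the
    client always win over the configured defaults.
    """
    present = {name.lower() for name in request_headers}
    merged = dict(request_headers)
    for name, value in defaults.items():
        if name.lower() not in present:
            merged[name] = value
    return merged


defaults = {'X-Delete-After': '2592000'}
# The client sent no expiry, so the default is applied.
print(apply_defaults({'Content-Type': 'text/plain'}, defaults))
# The client set its own value, so the default is ignored.
print(apply_defaults({'x-delete-after': '60'}, defaults))
```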