What is snapshot retention and what does it mean to me?
A snapshot retention policy is a means to purge old data. This policy is necessary to prevent our storage system from becoming full, while ensuring that we keep sufficient point-in-time backups to meet client needs and service level agreements (SLAs).
The configured snapshot retention policy is used to control which recovery points will be retained on the device, how long those recovery points will be retained, and when the remaining recovery points will be automatically purged to conserve disk space.
- A snapshot retention policy can control the automatic cleanup of certain recovery points on both appliances and vaults, and can apply to both private Cloud and Axcient Cloud.
- Snapshot retention settings can be set entirely independently on each device.
- Snapshot retention settings can be different on appliances and vaults, and can be customized to fit the needs of partners and clients.
Two modes of snapshot retention
x360Recover supports two modes of backup snapshot retention: basic and tiered.
Basic snapshot retention
Basic snapshot retention is our legacy recovery point grooming system. This retention cleanup mechanism is simplistic: All snapshots less than x days old are retained. Any snapshot older than x days is purged.
Basic snapshot retention is not an efficient storage method. It is inflexible, and is not well suited to long-term SLA requirements for retaining historical recovery points.
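The basic rule can be sketched in a few lines of Python. This is only an illustration, not the x360Recover implementation: the function name `basic_retention`, its parameters, and the choice to keep a snapshot that is exactly x days old are all assumptions, and the safety rules described later in this article are ignored.

```python
from datetime import datetime, timedelta

def basic_retention(snapshots, keep_days, now):
    """Split snapshot timestamps into (kept, purged) by a single age cutoff."""
    cutoff = now - timedelta(days=keep_days)
    kept = [ts for ts in snapshots if ts >= cutoff]    # newer than the cutoff
    purged = [ts for ts in snapshots if ts < cutoff]   # older than the cutoff
    return kept, purged

# Hypothetical data: snapshots taken 1, 5, 10 and 40 days ago.
now = datetime(2021, 4, 28, 12, 0)
snapshots = [now - timedelta(days=age) for age in (1, 5, 10, 40)]
kept, purged = basic_retention(snapshots, keep_days=30, now=now)
print(len(kept), "kept;", len(purged), "purged")
```

Because there is only one cutoff, nothing older than x days survives, which is why this mode is a poor fit for long-term historical recovery points.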
Tiered snapshot retention
Tiered snapshot retention is the modern standard for backup retention and recovery point granularity. As the name implies, tiered retention is designed to give a more granular level of control over retained snapshots by specifically keeping a configurable set of snapshots grouped by time frame (that is, days, weeks, months, and years).
Which snapshots will be retained?
With tiered retention, it is important to understand the selection method used for determining which snapshots will be retained vs those that will be deleted.
- Both basic and tiered retention start by retaining all snapshots for some number of days.
- Retention of specific snapshots (to satisfy the designated tier retention periods) begins after the last day of this keep-all period.
- The snapshot chosen for retention will be the last snapshot available within the given period.
For example, when calculating daily snapshots, the snapshot chosen for retention will be the last backup taken during each day.
But remember to consider the details of those longer time periods: The last backup during a weekly time frame might be on Friday, or Sunday, or even Tuesday afternoon, if backup frequency is intermittent.
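The "last snapshot within the period" selection can be illustrated with a short Python sketch. The helper names (`last_snapshot_per_period`, `day_key`, `week_key`) and the sample timestamps are hypothetical; the sketch assumes weeks follow the ISO calendar (Monday to Sunday) and is not the product's actual code.

```python
from datetime import datetime

def last_snapshot_per_period(snapshots, period_key):
    """Map each period to the latest snapshot timestamp that falls inside it."""
    latest = {}
    for ts in sorted(snapshots):
        latest[period_key(ts)] = ts  # later timestamps overwrite earlier ones
    return latest

def day_key(ts):
    return ts.date()

def week_key(ts):
    iso_year, iso_week, _ = ts.isocalendar()
    return (iso_year, iso_week)

# Hypothetical, intermittent backups within a single week.
snapshots = [
    datetime(2021, 4, 26, 9, 0),   # Monday morning
    datetime(2021, 4, 26, 17, 0),  # Monday evening
    datetime(2021, 4, 27, 14, 0),  # Tuesday afternoon, last backup that week
]
daily = last_snapshot_per_period(snapshots, day_key)
weekly = last_snapshot_per_period(snapshots, week_key)
```

Here the daily selection keeps Monday evening and Tuesday afternoon, while the weekly selection keeps only Tuesday afternoon, because no later backup exists in that week.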
Semantics: Po-TAY-toe or Po-TAH-toe?
- Tiered retention is predicated on retaining the last x snapshots for each given time frame. Read that again. Setting retention to keep seven daily snapshots does NOT (necessarily) mean you'll keep only snapshots from the last seven days. Setting retention to keep seven daily snapshots means you'll retain the last seven existing end-of-day snapshots.
So, if your protected system is configured to only take one backup per week, is that seventh snapshot retained from 42 days (six weeks) ago?
Not quite. It will actually be x number of days (based on your configuration for the Keep All snapshots value) plus 42 days old!
- All tiers are each calculated starting the day AFTER the end of the Keep All snapshots period.
In other words, each tier is calculated independently, starting from the same date: the end of the Keep All time frame.
- Unlike most chain-based backup solutions offering tiered retention (sometimes referred to as Grandfather-Father-Son or GFS retention), x360Recover does NOT (a) keep x daily snapshots, and then (b) keep another x weekly snapshots etc. All tiers concurrently overlap one another, by starting from the same date in time.
For example, if you retain fourteen daily and two weekly snapshots, each of those weekly snapshots coincides with one of the fourteen daily snapshots, so you have not added any length to your overall retention time frame.
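The overlap can be checked with a small Python sketch. It assumes one snapshot per day, ISO calendar weeks, and a hypothetical `tier_start` date standing in for the day after the Keep All period ends; the function `pick_tiers` is illustrative only, and the plus-one rule described later in this article is left out for simplicity.

```python
from datetime import date, timedelta

def pick_tiers(snapshot_days, tier_start, daily, weekly):
    """Pick daily and weekly snapshots independently, both counting back from tier_start."""
    eligible = sorted((d for d in snapshot_days if d <= tier_start), reverse=True)
    daily_picks = eligible[:daily]
    last_of_week = {}
    for day in eligible:  # newest first, so the first day seen in a week is its last snapshot
        iso_year, iso_week, _ = day.isocalendar()
        last_of_week.setdefault((iso_year, iso_week), day)
    weekly_picks = list(last_of_week.values())[:weekly]
    return daily_picks, weekly_picks

# Hypothetical history: one snapshot per day for 60 days.
tier_start = date(2021, 4, 14)
snapshot_days = [tier_start - timedelta(days=n) for n in range(60)]
daily_picks, weekly_picks = pick_tiers(snapshot_days, tier_start, daily=14, weekly=2)
oldest_retained = min(daily_picks + weekly_picks)
```

In this sample, both weekly picks are already among the fourteen daily picks, so the oldest retained snapshot is set by the daily tier alone.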
Default retention settings
A newly-registered protected system will default to tiered mode retention with the following settings:
After fourteen days, tiered mode retention uses the following settings:
Zero as a retention tier setting
Setting any of the tier retention values to zero disables that tier. For example, you might set the yearly tier to zero to only retain some number of daily, weekly, and monthly recovery points.
You can also "skip" a tier by setting it to zero. For example, if you set weekly to zero, you will only keep daily, monthly, and yearly recovery points.
Note: You cannot set Keep All to zero.
Negative 1 as a retention tier setting
Setting any of the retention tier values to -1 keeps an unlimited number of snapshots in this tier. For example, set yearly to -1 to keep an unlimited number of end-of-year recovery points.
You can also "leap" a tier. For example, set the weekly tier to -1 to retain end-of-week snapshots forever, and then set specific limits on monthly and/or yearly backups.
Note: You can only set Keep All to -1 in basic mode. (This setting is not recommended!)
Storage license retention
Storage licensing is a special x360Recover license mode in which protected systems are licensed by the amount of storage they consume on the appliance - rather than by the number of individual servers and workstations being protected.
Storage licensing includes
- unlimited endpoints backed up on a given appliance (within the licensed storage amount)
- unlimited storage within the Axcient Cloud for either a three-year or ten-year period.
Retention settings for endpoints licensed under this licensing mode may be set to any value on the appliance, but are restricted (and locked) to a fixed retention policy on the Cloud vault, depending on whether a 3-year or 10-year license is assigned.
Three-year storage license fixed retention policy (vault only)
After seven days:
Ten-year storage license fixed retention policy (vault only)
After fourteen days:
Data retention safety rules
One major goal of a retention cleanup mechanism in a disaster recovery system is to efficiently optimize storage space. Optimizing storage with tiered retention generally allows for larger numbers of older recovery points from which to recover files or perform wholesale disaster recovery.
But the primary goal, of much greater importance, is to ensure that sufficient historical recovery points are retained to recover from any disaster situation.
And to ensure that successful recovery experience, Axcient has added additional caveats and conditions to our retention cleanup rules. We prefer to err on the side of expanding the data available for recovery rather than focusing solely on optimizing storage.
The following rules modify and extend the pure retention schema detailed above:
The safety pass
The first thing our retention algorithm does is set aside the seven most recent snapshots, so that they are never considered for removal.
Regardless of the retention policy settings (whether basic or tiered), under no circumstances will retention cleanup remove any of the last seven remaining recovery points of a protected system.
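As a rough illustration of this rule, the Python sketch below filters a list of purge candidates so that the seven newest recovery points always survive. The function `apply_safety_pass` and the sample data are hypothetical and only show the idea, not the actual retention engine.

```python
from datetime import datetime, timedelta

PROTECTED_COUNT = 7  # the last seven recovery points are never removed

def apply_safety_pass(all_snapshots, purge_candidates):
    """Drop any purge candidate that is among the newest PROTECTED_COUNT snapshots."""
    protected = set(sorted(all_snapshots)[-PROTECTED_COUNT:])
    return [ts for ts in purge_candidates if ts not in protected]

# Hypothetical worst case: a policy that would purge all ten existing snapshots.
newest = datetime(2021, 4, 28)
all_snapshots = [newest - timedelta(days=n) for n in range(10)]
to_purge = apply_safety_pass(all_snapshots, purge_candidates=all_snapshots)
```

Even though every snapshot was nominated for removal, only the three oldest remain on the purge list.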
The value of each tier is calculated as plus 1
Whatever value you enter for daily, weekly, monthly, or yearly, the retention engine adds one to it.
Why do we do this? Because calculating retention based on keeping the last recovery point is somewhat counter-intuitive. Determining in your head what your oldest snapshot will be is tricky.
For the most extreme example of this, consider yearly backups. If you set your schedule to retain one yearly snapshot, intuitively (in most people’s minds) that implies that you retain a whole year of data that you can roll back and recover. The last yearly snapshot will presumably be from one of the last days in December, from last year. But - if today is Jan 1, that ‘yearly’ snapshot is from last week!
- The bottom line is that retaining one extra daily or weekly snapshot has almost no impact on your overall storage usage. And retaining that extra monthly or yearly snapshot ensures that you really have that whole month or year of data history.
Note: Adding +1 to the value does not apply to values of zero or -1.
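The combined effect of the zero, -1, and plus-one rules can be summarized in a small Python sketch. The function name `effective_tier_count` and the use of `None` to mean "unlimited" are assumptions made for illustration.

```python
def effective_tier_count(configured):
    """Translate a configured tier value into the number of snapshots kept.

    Returns None for "unlimited" (an illustrative convention, not the product's).
    """
    if configured < 0:
        return None          # -1 (or any negative value) keeps an unlimited number
    if configured == 0:
        return 0             # zero disables the tier; no plus-one is applied
    return configured + 1    # every other value gets one extra snapshot

for value in (-1, 0, 1, 14):
    print(value, "->", effective_tier_count(value))
```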
Backup interruption detection
The start of the Keep All retention period is based on the most recent snapshot.
If no new backups are taken, no snapshot cleanup will occur.
- For example, your appliance has the Keep All snapshots set to thirty days.
- On Friday evening, something happens to your production environment, all your systems are down, and no backups occur over the weekend.
- On Monday, your client calls for assistance with recovering their environment.
- On your appliance, you still have every snapshot from the thirty days leading up to the last backup, which was taken on Friday.
A note on end-of-the-week oddities
End of the week snapshots have the following potential peculiarity:
- The ISO year may consist of either 52 or 53 full weeks.
- A week always starts on a Monday and ends on a Sunday.
- The first week of an ISO year is the first (Gregorian) calendar week of a year containing a Thursday. For example, if January 1 is a Thursday, the start of the week would be December 29 of the previous year.
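These ISO week properties can be explored with Python's standard `datetime` module. The snippet below is a minimal sketch (it needs Python 3.8 or later for `date.fromisocalendar`); the helper `iso_weeks_in_year` is a hypothetical name, and the years are chosen only as examples.

```python
from datetime import date

# January 1, 2026 is a Thursday, so it belongs to ISO week 1 of 2026.
new_year = date(2026, 1, 1)
iso_year, iso_week, iso_weekday = new_year.isocalendar()

# Monday (weekday 1) of ISO week 1 of 2026 falls in the previous calendar year.
week_one_monday = date.fromisocalendar(2026, 1, 1)

def iso_weeks_in_year(year):
    # December 28 always falls in the last ISO week of its year.
    return date(year, 12, 28).isocalendar()[1]

print(week_one_monday, iso_weeks_in_year(2020), iso_weeks_in_year(2021))
```

A snapshot taken in the last days of December can therefore belong to the first week of the following ISO year.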
Q: What happens during the retention passes if there have been no backups today?
A: The retention time window starts yesterday. Essentially, if backups stop occurring, retention cleanup will stop occurring.
Q: If a system clock is improperly set to the future, and then a snapshot is created, what happens when the clock is set backwards to the correct date/time?
A: The "retain-all-starting-date" is either the date/time of the newest snapshot, or the current date/time, whichever is older. (This ensures that a "parked" system will not start "losing" snapshots.)
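The "whichever is older" rule amounts to taking the earlier of two timestamps. A minimal Python sketch follows, with a hypothetical function name and sample dates:

```python
from datetime import datetime

def retain_all_start(snapshot_times, now):
    """Return the newest snapshot time or the current time, whichever is older."""
    return min(max(snapshot_times), now)

# Parked system: the last backup was Friday, and cleanup runs on Monday.
friday = datetime(2021, 4, 23, 18, 0)
monday = datetime(2021, 4, 26, 9, 0)
parked_start = retain_all_start([friday], now=monday)

# Clock skew: a snapshot was stamped in the future before the clock was corrected.
future = datetime(2031, 1, 1, 0, 0)
skewed_start = retain_all_start([friday, future], now=monday)
```

In the parked case the window is anchored to Friday's snapshot, so nothing ages out over the weekend; in the clock-skew case the window is anchored to the current time rather than to the future-dated snapshot.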
Q: Are ALL snapshots newer than the "retain-oldest-mark-date" marked for retention?
Q: If I specify a value < 0 for my "effective-number-of-periods" setting, what will happen?
A: Your "effective-number-of-periods" settings will default to unlimited.
Q: What happens if I disable the "effective-number-of-periods" by setting it to zero?
A: The pass will be skipped without doing anything. No end of period snapshots for that type of period will be marked for retention.
Q: How is the last snapshot for a day interval determined?
A: The latest snapshot is chosen from all snapshots that fall between 00:00 and 23:59 of the calendar day in question, as represented by the date data type.
For example: To determine the last snapshot of the daily period, we select all snapshots taken between 2021-04-28 00:00:00 and 2021-04-28 23:59:59. The snapshot retained is the one closest to the end of that day.
Q: How do I determine the last snapshot for a week?
A: The last snapshot for a week interval is the latest of all snapshots that fall between 00:00 on the first day of the week and 23:59 on the last day of the week. Week boundaries are defined by the "isocalendar" function of the Python date data type.
Q: When a protected system is configured for replication, are snapshots deleted that have not yet been successfully transferred to all configured vaults and ingested into ZFS?
A: No, snapshots that have not yet been successfully transferred to all configured vaults and ingested into ZFS cannot be deleted and will not be removed by retention cleanup regardless of what settings the user has chosen for retention.
Q: Do retention policies vary based on the license type?
A: Yes. Protected systems licensed on an appliance using Storage Licensing have a fixed, mandatory retention policy on Axcient-hosted vaults. The actual policy depends on the Storage License selected (3-year or 10-year), and changes to retention settings on the vault are locked out. Appliance retention settings may be selected at will.
Q: Are there any situations where retention cleanup on vaults is blocked?
A: Retention cleanup on vaults is blocked for a protected system while network recovery back to an appliance is in progress. While the job is active, retention cleanup will not occur and snapshots will not be removed.
Q: How efficient is the retention cleanup algorithm?
A: The retention cleanup algorithm should use a reasonable amount of memory (< 1 GB) and take a reasonable amount of time (< 5 seconds) even for a protected system that has 10,000 snapshots.
More questions about retention settings?
Still have questions about setting your retention?
Contact Axcient Support at https://partner.axcient.com/login or call 800-352-0248