Resumable Replication / Replication 2.0 - x360Recover

Written By Tami Sutcliffe (Super Administrator)

Updated at April 12th, 2022

With the release of version 8.1.0, the replication engine has been completely replaced.  Replication 2.0, generally referred to as ‘Resumable Replication’, is a total overhaul of the way replication is performed, providing both a more robust transfer mechanism and a 3x-5x improvement in end-to-end replication performance.

What is Resumable Replication? 

Replication has been rewritten to take advantage of the native snapshot transfer resume functionality available in ZFS on Linux 0.7+.  The new Resumable Replication engine no longer stages backup snapshots in the aristosbay folder on the Vault for later ingestion.  Snapshots are now written directly to their ZFS storage targets using native ZFS operations end to end.  Interrupted transfers can be resumed immediately from the last successfully delivered data block, without having to perform a long, CPU-intensive search to restart the transfer.
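
As an illustration of the native ZFS resume mechanism mentioned above, here is a minimal Python sketch that only builds the relevant command lines and does not run them.  It assumes the standard OpenZFS options (zfs receive -s, the receive_resume_token property, and zfs send -t).  The dataset and snapshot names are hypothetical, and this is not a description of how x360Recover invokes ZFS internally.

```python
def build_receive_cmd(target_dataset):
    """Receive a stream, keeping partial state if the transfer is interrupted."""
    # -s tells 'zfs receive' to save a resumable state instead of discarding it.
    return ["zfs", "receive", "-s", target_dataset]


def build_token_query_cmd(target_dataset):
    """Ask the receiving side for the resume token of an interrupted transfer."""
    # The property value is '-' when there is nothing to resume.
    return ["zfs", "get", "-H", "-o", "value", "receive_resume_token", target_dataset]


def build_send_cmd(snapshot, resume_token=None):
    """Send a snapshot from the start, or continue from a resume token."""
    if resume_token and resume_token != "-":
        # The token encodes where the stream stopped, so no search is needed.
        return ["zfs", "send", "-t", resume_token]
    return ["zfs", "send", snapshot]


# Hypothetical names, for illustration only.
print(build_send_cmd("pool/system1@snap42"))
print(build_send_cmd("pool/system1@snap42", resume_token="1-abc123"))
```

The sender never has to work out where the transfer stopped: the receiving side reports a token, and the sender hands that token back to ZFS to continue the stream.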

What does this mean for my vaults?

Overall vault transfer performance will now be 3x to 5x faster than the previous model.  With legacy replication, many systems transfer data to the staging area simultaneously, but only one snapshot from a single protected system can be ingested at a time. 

The previous process was also very inefficient overall, because every block of data was processed three times: first written to aristosbay, then read back from aristosbay, and finally written again to the protected system partition within the storage pool.

With Resumable Replication, the data is only processed once, written directly to the storage pool as it streams in over the Internet.  In addition, all protected systems can commit their data simultaneously, so there is no single-threaded ingestion bottleneck.  This provides a 60% decrease in the total IOPS needed to commit data to the vault.
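
The difference between the two models can be sketched with some simple arithmetic.  The Python below is a back-of-the-envelope model only: it counts the three legacy passes and the single new pass described above, and compares one-at-a-time ingestion with an idealized concurrent commit that ignores disk contention.  The function names and sample numbers are hypothetical.  Counting passes alone gives a reduction of about two thirds in bytes processed; the 60% figure quoted above is a statement about IOPS, and this model does not attempt to reproduce it.

```python
# Pass counts come from the description above; everything else is illustrative.
LEGACY_PASSES = 3     # write to staging, read from staging, write to pool
RESUMABLE_PASSES = 1  # write straight to the pool


def bytes_processed(snapshot_bytes, passes):
    """Total bytes read or written to commit one snapshot."""
    return snapshot_bytes * passes


def pass_reduction(old_passes=LEGACY_PASSES, new_passes=RESUMABLE_PASSES):
    """Fractional drop in bytes processed when the pass count falls."""
    return 1 - new_passes / old_passes


def serialized_ingest_time(ingest_times):
    """Legacy model: one snapshot is ingested at a time, so times add up."""
    return sum(ingest_times)


def concurrent_ingest_time(ingest_times):
    """Idealized concurrent model: bounded by the slowest system."""
    return max(ingest_times, default=0)


# Hypothetical per-system ingest times, in minutes.
times = [5, 10, 20]
print(serialized_ingest_time(times), concurrent_ingest_time(times))
print(round(pass_reduction(), 3))
```

In this model a 10 GiB snapshot costs 30 GiB of reads and writes on the legacy path and 10 GiB on the new one.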

Resumable Replication is built upon a new Vault Transfer Service, VT2.  VT2 eliminates the need to perform synchronous writes when writing data into the storage pool.  Asynchronous writes allow disk write caching to be used, which also substantially improves disk I/O performance.
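
To show the general difference between synchronous and cached (asynchronous) writes, here is a generic Python sketch.  It is not VT2 code and makes no claim about how VT2 is implemented; the function names are hypothetical.  The first function forces every block to stable storage before writing the next one, while the second lets the operating system cache the writes and flushes once at the end.

```python
import os


def write_synchronously(path, blocks):
    """Force each block to disk before the next one is written."""
    with open(path, "wb") as f:
        for block in blocks:
            f.write(block)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # wait until the OS reports it is on disk


def write_with_caching(path, blocks):
    """Let the OS cache and batch the writes, then flush once at the end."""
    with open(path, "wb") as f:
        for block in blocks:
            f.write(block)         # returns as soon as the data is buffered
        f.flush()
        os.fsync(f.fileno())       # single durability point for the whole file
```

Both functions leave the same bytes on disk.  The cached version typically finishes sooner because the storage layer can combine and reorder writes rather than waiting on every block.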

How will the upgrade to Resumable Replication work?

Legacy replication will continue to function normally until both your appliances and vaults are upgraded to v8.1.0. 

Once both vault and appliance are upgraded, Resumable Replication will automatically be enabled. 

Any protected system that is not in sync with the Vault and has a backlog of snapshots waiting for ingestion will pause further replication.  Once the Vault has completed ingesting all outstanding snapshots in aristosbay, that protected system will immediately resume replication using the new Resumable Replication system. 
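
The transition rules above can be summarized as a small decision function.  This Python sketch is illustrative only: the names (Mode, replication_mode, staged_backlog) are hypothetical, and it assumes the backlog is counted per protected system and that versions are plain dotted numbers such as 8.1.0.

```python
from enum import Enum

MIN_VERSION = (8, 1, 0)  # first release with Resumable Replication


class Mode(Enum):
    LEGACY = "legacy"        # at least one side is older than 8.1.0
    PAUSED = "paused"        # upgraded, waiting for staged snapshots to drain
    RESUMABLE = "resumable"  # upgraded and fully ingested


def parse_version(text):
    """Turn '8.1.0' or 'v8.1.0' into a comparable tuple like (8, 1, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def replication_mode(appliance_version, vault_version, staged_backlog):
    """Pick the replication mode for one protected system."""
    both_upgraded = (
        parse_version(appliance_version) >= MIN_VERSION
        and parse_version(vault_version) >= MIN_VERSION
    )
    if not both_upgraded:
        return Mode.LEGACY
    if staged_backlog > 0:
        # Snapshots still waiting in the staging area must be ingested first.
        return Mode.PAUSED
    return Mode.RESUMABLE


print(replication_mode("8.0.2", "8.1.0", 0))
print(replication_mode("8.1.0", "8.1.0", 3))
print(replication_mode("8.1.0", "8.1.0", 0))
```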

As more and more protected systems on your vault (and across the entire Axcient datacenter) catch up and transition into Resumable Replication mode, storage performance will improve, creating a cascading effect to speed along the ingestion of larger and larger backlogged systems.