Backup process, performance, and benchmarks

The WholesaleBackup platform is robust and secure, and it helps to understand what is happening “under the hood” during your customers’ backups. This technical note describes the backup process, how it relates to performance, and the optimizations you can make.

Process

During the scan phase, the following steps are performed:

  1. Every selection is followed, and each selected file’s last-modified date and time are captured.
  2. All de-selections are applied to the list of selected files. Even if you have not specified any de-selections, Windows does so via the registry key HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\BackupRestore\FilesNotToBackup
  3. Filters (regular expressions) are then applied to the selected files (for example, a filter can indicate that .TMP files should not be backed up).
  4. No-compress filters (regular expressions), if present, are applied (see the Benchmarks section below for an example).
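
The four scan steps above can be sketched in Python. This is only an illustration of the order of operations, not the client's actual code: the function name scan_files, the use of wildcard patterns for de-selections, and the case-insensitive matching are all assumptions made for the example.

```python
import os
import re
from fnmatch import fnmatch


def scan_files(selections, deselections, exclude_filters, no_compress_filters):
    """Return {path: (mtime, size, compress)} for files that survive the scan.

    Hypothetical sketch: de-selections are treated as wildcard patterns and
    filters as case-insensitive regular expressions (both are assumptions).
    """
    exclude = [re.compile(p, re.IGNORECASE) for p in exclude_filters]
    no_compress = [re.compile(p, re.IGNORECASE) for p in no_compress_filters]
    found = {}
    for root in selections:
        # Step 1: follow every selection and look at each selected file.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Step 2: drop anything covered by a de-selection.
                if any(fnmatch(path, pattern) for pattern in deselections):
                    continue
                # Step 3: drop anything excluded by a filter (e.g. .TMP files).
                if any(rx.search(path) for rx in exclude):
                    continue
                info = os.stat(path)
                # Step 4: flag files that should be stored without compression.
                compress = not any(rx.search(path) for rx in no_compress)
                found[path] = (info.st_mtime, info.st_size, compress)
    return found
```

The captured modify time and size are what a later backup pass would compare against the previous run to decide which files need processing.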

During backup, files whose modify date and/or size have changed since they were last backed up are processed using a block differential method. Each such file is split into 1 MB blocks, and each block is:

  1. Encrypted using AES-256 encryption.
  2. Compressed using the specified compression level.
  3. Checksummed.

Only those blocks that are not already present in the backup vault(s) are then transmitted (over 128-bit encrypted TCP tunnels) to the vault(s), either in sequence or in parallel, depending on the configuration specified on the Backup->Settings tab. As with a file system, restoring a file involves retrieving its blocks in the correct order (and, in this case, also decrypting, decompressing, and verifying their checksums).
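
To make the block differential idea concrete, here is a minimal Python sketch. It is not the product's implementation. It assumes SHA-256 as the checksum and block identifier, zlib as the compressor, and a plain dictionary standing in for the vault, and it leaves out the AES-256 encryption step because the Python standard library does not include AES. The names backup_blocks and restore_blocks are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 1024 * 1024  # 1 MB blocks, as described above


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split a file's bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def backup_blocks(data, vault, level=6):
    """Store only blocks the vault lacks; return (manifest, blocks_sent).

    Assumptions for illustration: `vault` is a dict keyed by the SHA-256 of
    each block, and encryption is omitted.
    """
    manifest, sent = [], 0
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in vault:  # block differential: skip blocks already stored
            vault[digest] = zlib.compress(block, level)
            sent += 1
        manifest.append(digest)
    return manifest, sent


def restore_blocks(manifest, vault):
    """Rebuild a file by fetching its blocks in order and verifying each one."""
    out = bytearray()
    for digest in manifest:
        block = zlib.decompress(vault[digest])
        if hashlib.sha256(block).hexdigest() != digest:
            raise ValueError("checksum mismatch for block " + digest)
        out += block
    return bytes(out)
```

In this sketch, backing up a 3 MB file and then backing it up again after changing only its middle 1 MB sends three blocks the first time and one block the second time, and either version can still be restored from its manifest.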

Performance

Factors that affect performance are:

  1. Speed of the client’s local disk(s) and network shares. NOTE: Most inexpensive USB drives perform poorly with many small files (such as our blocks), so you get what you pay for.
  2. Speed of the local disk drives (or NAS/SAN) of your WholesaleBackup server.
  3. IMPORTANT: Ability of your storage “devices” on your WholesaleBackup server to handle multiple/simultaneous readers/writers. On your WholesaleBackup server, where possible, use SAS drives and configure your RAID to optimize parallelism.
  4. The upload speed of your client’s Internet connection (download speed for restores).
  5. The download speed of your WholesaleBackup server’s Internet connection (upload speed for restores).
  6. The RAM and CPU capabilities of your WholesaleBackup server to handle multiple simultaneous threads writing to and reading from the server’s storage “devices.” (You can configure the number of threads the WholesaleBackup server uses, or add additional backup servers, by following these directions: https://support.wholesalebackup.com/hc/en-us/articles/200555594)

So, both client and server performance affect how long a backup takes. For a first backup, you may wish to disable remote backups and perform a local-only backup to a FAST USB, FireWire, or Thunderbolt storage device, perhaps with compression turned off for certain large files (as described in the Benchmarks section), and then physically transport that data to your server, so that all subsequent backups become block differential backups with de-duplication. More information on seeding backups can be found here: https://support.wholesalebackup.com/hc/en-us/articles/200601690

During backup, AES-256 encryption takes a bit longer than our competitors’ AES-128 encryption, but the extra security is well worth it. In addition, checksumming and block de-duplication add some overhead compared with a straight file copy, but they result in the fastest possible restore times and minimal server-side storage. Furthermore, for security reasons we only transmit data and commands over SSL-encrypted tunnels, which adds some overhead as well. Overall, this additional overhead saves you and your clients bandwidth and storage costs and ensures your customers’ data is as secure as possible.

Two additional considerations affect backup and restore performance. (a) Status messages are sent to your WholesaleBackup server so that you know exactly what each of your backup clients is doing. These messages are sent encrypted over SSL tunnels on threads separate from the client’s main backup processing threads, so they don’t typically slow down backups or restores. However, if your client loses Internet connectivity, or its throughput (or your backup server’s throughput) is very slow, performance can suffer, because the message threads can become a bottleneck for the processing threads if they all go into a wait state. (b) Local backups to USB drives, particularly USB 2.0 and earlier, can be very slow when hundreds of thousands of blocks are written, because most USB drives do not have RAM buffers, do not handle multiple readers/writers well, and are often slow 5400 RPM ATA/PATA/SATA drives.

Benchmarks

On a laptop with an SSD running version 14.05 of our backup client (in a VM), backing up to a WholesaleBackup server also running version 14.05 (in a VM), over a 1 Gb/s network connection:

  1. Scan throughput of more than 52K files/min for a scan with 1+ million files
  2. Backup throughput of about 500 small files/min for parallelized (simultaneous) remote and local backups
  3. Backup throughput of 90 MB/min (12 Mb/s) for parallelized (simultaneous) remote and local backups of large files.
    1. NOTE: Turning off compression in the client boosted performance to about 400 MB/min (53 Mb/s) with the same data. For large files that do not compress well, or for files that are already compressed, we therefore recommend either overriding the default compression for all files on the Backup->Settings tab, or turning it off for specific file extensions using a do-not-compress line such as the following in the .SEL file:
           % <.*\.((vhd)|(edb)|(mdb)|(bak)|(zip))$>
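
If you want to check which file names a pattern like this would catch before putting it in a .SEL file, you can try the regular expression (the part between the angle brackets) in Python. This is only an approximation: it assumes the backup client's regular expression engine behaves like Python's for this pattern, and the helper name skips_compression is made up for the example. Python matches case-sensitively here, so a name such as DISK.VHD would not match in this sketch. Whether the client treats case the same way is not covered in this note.

```python
import re

# The regular expression from the do-not-compress line above,
# without the surrounding "% <" and ">" of the .SEL syntax.
NO_COMPRESS = re.compile(r".*\.((vhd)|(edb)|(mdb)|(bak)|(zip))$")


def skips_compression(filename):
    """Return True if the file name ends in one of the listed extensions."""
    return NO_COMPRESS.match(filename) is not None


for name in ["server.vhd", "mail.edb", "archive.zip", "report.docx", "backup.bak.txt"]:
    print(name, skips_compression(name))
```

The first three names match and would be stored without compression; report.docx and backup.bak.txt do not match, because the pattern is anchored to the end of the name.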

These measured statistics demonstrate that our software is capable of very respectable performance with modern hardware and network infrastructure. Real-world conditions will vary greatly depending on client- and server-side hardware, network, and utilization. The Performance section above will help you better understand which aspects of your architecture may be the bottleneck.
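
For planning purposes, the unit conversion behind the figures above, and a rough duration estimate, can be written as a short Python sketch. The function names are made up for this example, 1 GB is taken as 1000 MB, and the 100 GB data set is hypothetical; as noted above, your real-world throughput may differ considerably from these benchmark rates.

```python
def mb_per_min_to_mbit_per_s(mb_per_min):
    """Convert megabytes per minute to megabits per second."""
    return mb_per_min * 8 / 60


def hours_to_back_up(total_gb, mb_per_min):
    """Rough time in hours to move total_gb at a steady rate (1 GB = 1000 MB)."""
    return total_gb * 1000 / mb_per_min / 60


print(round(mb_per_min_to_mbit_per_s(90), 1))   # 12.0 (compression on)
print(round(mb_per_min_to_mbit_per_s(400), 1))  # 53.3 (compression off)
print(round(hours_to_back_up(100, 90), 1))      # 18.5 hours for a hypothetical 100 GB
print(round(hours_to_back_up(100, 400), 1))     # 4.2 hours at the faster rate
```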

Summary

WholesaleBackup provides a secure, reliable, and high-performance backup and restore architecture with block-level differentials and de-duplication. It has been architected for security, reliability, fast restores, and minimal online storage and Internet bandwidth usage. As a result, backups take longer than with most local-only backup solutions, so direct performance comparisons are not justified. This technical note details the factors that impact performance, what to expect with good hardware and networks, and the steps you can take to improve performance.

Comments

  • Frank

    What does block de-duplication mean? Is it different in meaning from block duplication?