How to do backups

EXCELLENT INTRO: The Tao of Backup

The goal: avoid bit rot (random bit flips) or worse later down the line.

The main point that I see missing from lots of solutions is that they don't “scrub” your data very often, if at all. By scrub, I mean check every byte (typically done with a checksum) of both the source data and the backups to make sure they aren't silently corrupted over time because you haven't been accessing them.
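
A crude DIY version of scrubbing (just a sketch; the path is a placeholder) is to build a checksum manifest once and re-verify it periodically, which forces every byte to be re-read:

find /path/to/data -type f -exec sha256sum {} + > manifest.sha256
sha256sum -c --quiet manifest.sha256    # later: re-reads everything, prints only mismatches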

Thankfully, disk manufacturers include error-correction data by default, but it only seems to be applied when you actually read the data, although some drives apparently do background scrubbing. So if your backup script doesn't touch every byte of the drive, errors can accumulate and eventually screw you up! Not sure how long that takes exactly; it's a birthday/hash-collision problem. See Collision Probability

Ubuntu scrub solution (check for bad blocks): http://linux-sys-adm.com/how-to-check-for-bad-sectors-ubuntu-linux/
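
That howto presumably uses badblocks; either way, a read-only pass looks roughly like this (device name is a placeholder, double-check it first):

sudo badblocks -sv /dev/sdX    # -s progress, -v verbose; read-only scan for unreadable sectors

Note this only checks that sectors are readable; it doesn't compare contents against a checksum.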

On Windows you can run a full SMART scan with HDDScan, but I might try DiskCheckup next time for a full scan; it seems to give more information and has a better interface.

A good solution should scrub both the source data and the backups regularly, not just copy files.

I like Dropbox so far if you want an automated way to do this for <5-10 GB of data for free. Their 1 TB tier is $8.25/month. Just don't launch it on startup; it likes to scrub all your files before letting you use your computer :-)

Collision Probability

Google says they see about 5 bit errors per hour per 8 GB of RAM (ECC_memory). Wow… hadn't thought about ECC RAM, yikes! They indicated that the errors are correlated with high CPU activity, so the figure probably isn't directly applicable to our hard-drive backup case (although it actually is useful in another sense).

So… pick an error rate, scale it up to the size of your HD, then use the Birthday_problem#Cast_as_a_collision_problem calculation to find how many flips it takes before there's a reasonable probability of a collision. Then make sure we scrub more often than that!

How many years do we need to wait? Assume 5 flips per hour per 8 GB, and stop once there's a 50% probability of two flips landing in the same byte.
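
Rough sketch of the math (my assumptions: a 1 TB drive, and reusing the RAM figure above even though it almost certainly isn't the right rate for disks). With $d$ byte slots and $n$ random flips, the birthday approximation is

$p(n) \approx 1 - e^{-n(n-1)/(2d)}$, so $n_{50\%} \approx \sqrt{2d\ln 2} \approx 1.177\sqrt{d}$.

For $d = 10^{12}$ bytes (1 TB) that gives $n_{50\%} \approx 1.2 \times 10^{6}$ flips. Scaling 5 flips/hr per 8 GB up to 1 TB gives about 625 flips/hr, so $t \approx 1.2 \times 10^{6} / 625 \approx 1900$ hours, roughly 80 days. Under those (shaky) assumptions the answer is months, not years, so scrub at least every couple of months.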

Checksumming File Systems

BTRFS/ZFS/ReFS (Windows) keep a checksum of each file/sector? (not sure which) alongside the data in a custom file system. Apparently rsync doesn't keep checksums, so either keep multiple backups as a final arbiter in case a checksum doesn't match, or use a checksumming file system.
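
On BTRFS the scrub is built in; assuming the backup filesystem is mounted at /mnt/backup (placeholder path), it's roughly:

sudo btrfs scrub start /mnt/backup    # re-reads all data/metadata and verifies checksums
sudo btrfs scrub status /mnt/backup   # progress and error counts

(The ZFS equivalent is zpool scrub <poolname>.)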

Test hard drives long term

disk-filltest and maybe Data Test Program (dt). Other options: memtest86 for memory and mprime for the CPU. Both are included on http://www.ultimatebootcd.com/ and Hiren's BootCD.

disk-filltest

// For writing data (presumably the same invocation without -r, the verify-only flag)
sudo ./disk-filltest-64bit -C /media/nhergert/foo -s 0 -S 100

// For verifying data...
sudo ./disk-filltest-64bit -C /media/nhergert/foo -r -s 0 -S 100

3 months later, no data corruption on 750GB, interesting :-)

Can check S.M.A.R.T. data on Ubuntu using gsmartcontrol. Make sure to run it with sudo :-)
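
Command-line equivalent via smartmontools (device name is a placeholder):

sudo smartctl -t long /dev/sdX    # start the long/thorough self-test
sudo smartctl -a /dev/sdX         # later: dump attributes and self-test results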

Use nohup to log to a file and keep the job running after the ssh session closes. http://unix.stackexchange.com/questions/101529/can-a-shell-script-running-in-a-ssh-continue-to-run-if-the-ssh-instance-closes
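
Roughly like this (log file name is arbitrary; start it from a root shell via sudo -i so there's no password prompt once it's in the background):

nohup ./disk-filltest-64bit -C /media/nhergert/foo -r -s 0 -S 100 > filltest.log 2>&1 &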

Synology

Looks great for a DIY collaboration dropbox.

Unfortunately WebDAV doesn't work over the normal QuickConnect. You need to set up DDNS, which seems doable? https://www.synology.com/en-us/knowledgebase/DSM/tutorial/General/How_to_make_Synology_NAS_accessible_over_the_Internet#t3

Backup Data Reliably

Conclusion: Dropbox at roughly $10/TB/month is probably good. If you want to do it yourself, be sure to check the integrity of all your data often, either with a thorough S.M.A.R.T. self-test or with the BTRFS/ZFS “scrub” features.

Pre-done

Amazon Glacier is currently $0.007/GB/month to store, which works out to about $7/TB/month, in the same ballpark as Dropbox's $10/TB/month.

How do you ensure that the source data is not corrupted? For example, do you checksum all data every time the app is opened? How do you (or anyone) handle an I/O read error from an uncorrectable bad sector? Do you update the sector in the source file if it's corrupted and you have a good backup copy?

“Scrub”: force the hard drive to correct all bits using its ECC, or the file system to correct all bits using checksums/backups, on the source drive.

Optional questions if you know:

Thanks!

Nolan

DIY

rsync --dry-run --checksum --itemize-changes --exclude "*/homerot/*" --exclude "*/551_projects/*" -azR /media/nhergert/Ubuntu/NolanBackup/home/nhergert/DropboxArchive/ /media/nhergert/a5cbb8a7-e29f-4f30-b09f-e1e3bd17746d/home/nhergert/DropboxArchive
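
That's the dry-run/verify pass (--checksum makes rsync re-read and compare file contents instead of trusting size/mtime). The actual copy would presumably be the same command without --dry-run:

rsync --checksum --itemize-changes --exclude "*/homerot/*" --exclude "*/551_projects/*" -azR /media/nhergert/Ubuntu/NolanBackup/home/nhergert/DropboxArchive/ /media/nhergert/a5cbb8a7-e29f-4f30-b09f-e1e3bd17746d/home/nhergert/DropboxArchive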

BTRFS probably

Sync files using rsync (ssh key copied). On write of changed files, rsync will __ (unsure). Periodically run btrfs scrub to check the state of the bits in both backups.
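
Rough sketch of how that could be wired together with cron (paths, schedule, and mount point are my guesses, not the real setup):

# /etc/cron.d/backup (hypothetical)
0 3 * * *  nhergert  rsync -az /home/nhergert/DropboxArchive/ /mnt/backup/DropboxArchive/
0 4 1 * *  root      btrfs scrub start -B /mnt/backup    # monthly checksum scrub; -B waits for completion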

Won't

Bad?

ZFS / BTRFS protect against bit rot, and let you “scrub” the data to verify the checksums.

Ubuntu howto on BTRFS, how to test. Not sure about long-term file-structure reliability yet. But just include a copy of that OS on there and you should be fine…
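
Quick test setup (device and mount point are placeholders; mkfs destroys whatever is on that partition):

sudo mkfs.btrfs /dev/sdX1
sudo mount /dev/sdX1 /mnt/test
sudo btrfs scrub start -B /mnt/test    # should come back clean on a fresh filesystem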

I don't really want to use ZFS, as it's generally 64-bit only and unstable on 32-bit.

Datacenter Admin

Erasure coding is more storage-efficient than RAID mirroring: for example, a 10+4 erasure code stores 1.4x the original data versus 2x for a mirror. https://www.intel.com/content/www/us/en/storage/erasure-code-isa-l-solution-video.html. It just requires some CPU overhead. Can use Ceph, which probably supports Intel's ISA-L acceleration. https://en.wikipedia.org/wiki/Ceph_(software)