Why backups fail
According to a 2010 Forrester Research study, nearly 40% of all backups fail. Some industry experts say that number is very conservative and put the figure somewhere between 50% and 79%. Whichever numbers one uses, it’s obvious that we have a problem when it comes to backup reliability.
How Come?
There are five basic reasons that backups fail:
- Media failure – Media failure ranks at the top of nearly every list of reasons backups and restores fail. For this reason, it’s important to treat your backup media with respect and use it intelligently.
In the case of tape, this means making sure you follow the vendor’s directions for handling and storage, replacing the tapes regularly and cleaning the drives according to the manufacturer’s schedule. It also means discarding any suspicious tapes.
Don’t assume disk-based backup protects you from media-related failures. While the incidence of media-related failures is considerably lower with disk than with tape, failures still occur more frequently than is acceptable.
Scratched CDs and DVDs, failed backup disks and failed backup tapes all fall under the media failure category.
- Human error – In spite of its No. 2 ranking, human error is probably the most prolific cause of backup failures. For example, if tapes are improperly stored between uses, is the resulting failure a media failure or a human error? Usually there’s a significant component of human error in any backup failure.
The best safeguard against human error in backups is to train those involved to follow best practices. Make sure that the people performing backups and restores understand exactly what they need to do — and what not to do.
It is also a good idea to take the person out of the loop as much as possible. Ideally, backups should not require any human action. Be especially cautious of situations where backup isn’t part of someone’s main duties — for instance, someone in a branch office who’s been asked to make a backup tape every night.
- Software Failure – Sometimes new software or new versions of software can cause backup failures. For example, Service Pack 2 (SP2) for Windows XP turns on the firewall by default. When Microsoft released SP2, a lot of network backups failed because the backup software wasn’t designed to work through a firewall.
More commonly, the problem is misconfiguration. Modern backup software is extremely flexible, which means you have a lot of options to choose from, and choosing the wrong ones can result in incomplete backups or backups that fail totally.
A related problem is that backup configurations are no more static than anything else in a modern storage environment. As resources are added and shifted and priorities change, the list of files to be backed up needs to change as well.
- Hardware Failure – Tape drives, libraries, disk arrays and other backup hardware can also fail. Most of the causes and failure conditions for backup hardware are the same as for other kinds of hardware, but there are a few conditions that are specific to backup systems.
For example, drift produces a particularly nasty kind of failure in tape drives. As the drive ages, the heads slowly wander out of alignment. As a result, other drives can’t read the tape, and the drive can’t read a tape it wrote some time ago. The nasty part of this is that the drive can almost always read a tape it just wrote, so the tape passes an immediate verification step in the backup process without complaint.
- Network failure – Backing up over a network increases efficiency by reducing the number of backup devices. However, it also introduces another point of failure into the backup process. Everything from a failed or flaky HBA to a misconfigured switch can cause a backup to fail.
This is a less prolific source of backup failures because the network, LAN or SAN, is used for much more than just backup, so problems will tend to become obvious before they can hurt your backups.
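The earlier point about backup configurations going stale as resources are added and shifted can be checked mechanically. Below is a minimal Python sketch, assuming a hypothetical setup in which the backup job is driven by a list of included directories. The function names (`is_covered`, `find_uncovered`) and the example paths are illustrative only and do not come from any particular backup product.

```python
from pathlib import PurePosixPath


def is_covered(path: str, include_roots: list[str]) -> bool:
    """Return True if `path` is an included root or lies underneath one."""
    p = PurePosixPath(path)
    for root in include_roots:
        r = PurePosixPath(root)
        # Compare whole path components, so "/homework" is not "under" "/home".
        if p == r or r in p.parents:
            return True
    return False


def find_uncovered(data_dirs: list[str], include_roots: list[str]) -> list[str]:
    """List the data directories that no include root covers."""
    return [d for d in data_dirs if not is_covered(d, include_roots)]


if __name__ == "__main__":
    # Hypothetical example: /srv/crm was added after the backup job was set up.
    include_roots = ["/srv/finance", "/home"]
    data_dirs = ["/srv/finance/2024", "/home/alice", "/srv/crm"]
    print(find_uncovered(data_dirs, include_roots))
```

With the example values above, the only directory reported is `/srv/crm`. Running a check like this whenever storage is added is one way to catch a backup list that has quietly fallen behind.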
What to do?
Whatever the cause of failure, the best way to keep backup failures from damaging your organization is to verify your backups by performing regular test restores. Testing your backups regularly won’t prevent failures, but it will help you notice a problem and fix it before you really need those backups and get a nasty surprise.
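As a rough illustration of what a test restore check can look like, here is a minimal Python sketch. It assumes you have already restored a backup into a scratch directory with whatever tool you use, and that the source files have not changed since the backup was taken. The function names and the idea of comparing SHA-256 checksums are our own illustration, not a feature of any specific backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Return a list of files that are missing or differ in the restored copy."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif sha256_of(src) != sha256_of(restored):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every source file was found in the restored copy with identical contents. Files that legitimately changed after the backup ran will show up as "differs", so a check like this works best against data that is static or against a snapshot taken at backup time.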
Beyond that, the following steps are easy:
- Online Backup – Subscribe to an online backup service. Given that you’re a business, it doesn’t make sense to use a ‘home’ subscription of a backup service. Make sure you use a business-class service. If they’re offering ‘unlimited’ backup at a low price, it’s probably too good to be true. We’ve heard reports of hassles getting past 40 GB, having to request additional backup space, and upsell pitches by vendors who offer the ‘unlimited’ service. See our post about the differences between home and business backup.
- Use the proper software – If you insist on a local backup solution, use reputable software, such as Acronis, Asigra, etc. Don’t use the application that came with your system, whether it’s Windows, Mac or Linux. Those systems are fine for your teenager’s home use (mostly), but aren’t reliable enough for business use. Equally, this is not an area to scrimp on, so avoid open-source software (and I don’t say that very often…). This is an application for which you’ll need support if something goes wrong.
- The human factor – What can be said here? We’re all human, and we all make mistakes. When critical data is involved, however, a failed backup can be pretty unforgiving from a business perspective. Still, if you’re going to insist on doing a local backup as your only solution, you’re going to involve SOMEBODY doing SOMETHING, and that is a potential point of failure. Using tapes? Check your backups by doing a test restore. Using USB flash/thumb drives? At the very least, attach the drive to your key chain so it goes with you when you leave the office at night. Schedule your backup for the end of your business day.
- Hardware failure – Again, if you really feel that this is the cheapest way to go, purchase a second hard drive along with the one you purchase to use for your backups. Keep it safe, cool and dry. Unused, a new SATA hard drive will last at least 4 years before the mechanics dry out and it becomes unreliable to use.
- Network – Minimize the route between the device to be backed up and your storage device. Mapping a virtual drive is perhaps the easiest way. Check your backup media frequently to ensure that the scheduled backup has taken place.
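Checking that the scheduled backup has actually taken place is easy to automate. The following is a minimal Python sketch, assuming your backup job writes its output as files into a single folder. The function names and the 26-hour default (a nightly job plus some slack) are assumptions made for illustration, so adjust them to your own schedule.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(backup_dir: Path) -> Optional[float]:
    """Return the age in hours of the newest file, or None if the folder is empty."""
    files = [p for p in backup_dir.iterdir() if p.is_file()]
    if not files:
        return None
    newest = max(p.stat().st_mtime for p in files)
    return (time.time() - newest) / 3600.0


def backup_is_fresh(backup_dir: Path, max_age_hours: float = 26.0) -> bool:
    """True if at least one backup file was written within the allowed window."""
    age = newest_backup_age_hours(backup_dir)
    return age is not None and age <= max_age_hours
```

A check like this only tells you that something was written recently. It says nothing about whether the backup can be restored, so it complements a test restore rather than replacing one.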
Of course, since we are in the business of selling online backup, you would expect us to urge you to use online backup. While we’d hope that you become a customer, we suggest that you look at the various online backup options available, including ours. Our chief differentiator is that subscribers are randomly selected for test restorations every day, which means that, chances are, your data will be tested at least once every two weeks, and generally more often. Other companies advertise widely: SOS Online Backup, Carbonite, and Mozy are the most familiar names. We suggest you look them all over, and then take a look at our offering before making a decision.
Whatever you choose, be aware of the risks involved in each method of backup, and take the appropriate steps. Being aware of, and remediating, the issues involved in your type of backup will go a long way towards ensuring the success of your backup strategy, and will put you ahead of over 70% of all small businesses.