If your business is like most, it secures its information infrastructure by regularly backing it up onto tape. Or, you may have gone further, enhancing your backup strategy with expensive disk arrays and mirroring. If an earthquake, flood, blackout or hard-disk failure should catch you by surprise, these backups would ensure the survival of your company’s information.
Should human or software errors, which account for approximately 40 percent of all application-related disasters, corrupt your data, you would simply reach for a recent backup to get your users back on their feet. But would simply having these regular backups stashed away someplace safe be enough?
Should your information infrastructure go down, regular backup tools may prove insufficient. While you doubtless will be able to restore corrupted data eventually from tape or disk arrays, you may find the recovery process takes too long. Data integrity may not be fully maintained, as you most likely will lose all information added or modified since the last backup was performed.
Even worse, there is a high probability that you will not be able to bring databases up again at all. This may result from any number of problems, including physical media faults and hardware/software incompatibilities.
In a nutshell, relying on traditional backups alone may leave you with prolonged downtime and make a serious dent in your productivity and profitability.
Snapshot-Based Backups
To ensure higher data availability and faster recovery from data corruption, IT-dependent organizations have been showing growing interest in snapshot-based backup solutions. A complement to conventional periodic backups — typically made once per day — these solutions maintain frequent watch over your data.
Snapshot-based backup solutions generally give you added security for the simple reason that they allow you to store the state of your data more often — typically once every few hours. Should corruption occur, you will no longer have to spend hours restoring all of your data from yesterday’s tape backup. Instead, you will be able to relatively quickly restore data from a previously stored system snapshot.
On the downside, disk activity typically needs to be suspended while snapshots are taken. Also, snapshots require tight integration with the specific application server being secured, and data integrity can only be assured if either the application itself or a system administrator triggers the snapshots at the right moments.
This means you will need to invest a significant amount of time and effort in configuring and customizing a snapshot-based backup system for your particular Exchange, SQL Server or Oracle setup. If you’re out of luck, and the most recently recorded snapshot is out of sync with your application’s latest consistent state, restoring data from the snapshot may have unpredictable results.
Snapshots are typically made once every few hours, and additional snapshots may be taken to create offline backups (most likely on a daily basis). Should disaster occur when only tape backups or snapshots are available, you stand to lose as many as 24 hours of updates, not counting the amount of time required to carry out restore operations. At the very least, you will lose all updates made during the hours that have elapsed since the last snapshot or backup was made.
If corruption occurred even earlier than that, you will have to go back further and lose even more data. Also note that rapid restoration from snapshots will only be possible if you restore entire volumes. The restoration process will slow down significantly if you attempt to restore individual files, directories or databases.
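To put rough numbers on that exposure, here is a minimal Python sketch. The function name, the schedule and the timestamps are all invented for illustration: a nightly tape at midnight, snapshots every four hours, corruption at 13:30 that is noticed at 15:00. It picks the latest restore point taken before the corruption and reports how many hours of updates would be lost, leaving aside the time the restore itself takes.

```python
from datetime import datetime, timedelta

def lost_updates_window(restore_points, corrupted_at, detected_at):
    """Return (restore_point, lost_interval) for the newest clean restore point.

    Only restore points taken before the corruption are usable; everything
    written between that point and the moment of detection is lost.
    """
    usable = [p for p in restore_points if p < corrupted_at]
    if not usable:
        raise ValueError("no restore point predates the corruption")
    restore_point = max(usable)
    return restore_point, detected_at - restore_point

# Hypothetical schedule: the dates and times are made up for the example.
day = datetime(2024, 1, 15)
nightly_tape = [day]                                              # 00:00 only
snapshots = [day + timedelta(hours=h) for h in range(0, 24, 4)]   # every 4 hours
corrupted_at = day + timedelta(hours=13, minutes=30)
detected_at = day + timedelta(hours=15)

_, tape_loss = lost_updates_window(nightly_tape, corrupted_at, detected_at)
_, snap_loss = lost_updates_window(snapshots, corrupted_at, detected_at)
print(tape_loss)  # 15:00:00
print(snap_loss)  # 3:00:00
```

In this made-up scenario, snapshots cut the loss from 15 hours to three, but three hours of updates are still gone.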
Better than regular tape backup, but certainly not perfect.
Continuous Backup Gets the Job Done
Users interested in the highest data integrity and recovery speeds are probably best served by boosting their backup strategy with the latest continuous backup solutions. These may be added to your existing backup infrastructure — tape-based, snapshot-based or a combination of both — and will monitor your application servers while capturing and recording all operations (writes, deletes, copies and so forth) applied to them in a journal at all times. No data is actually moved around, as only the operations carried out on the data, and not the actual data, are logged.
Should data corruption occur, affected servers may simply be “rewound” by playing back an opposite operation (or “counter-event”) for each operation previously logged in the journal. Not only does this carry the benefit of allowing you to back up vast amounts of data accumulated over a long period of time — remember, it’s not the data itself that gets backed up, but the actions taken to create or modify it — but it also means recovery will be practically instantaneous.
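The rewind idea can be illustrated with a toy Python sketch. It is not a description of how any particular product works: the class and method names are invented, the "server" is just an in-memory dictionary, and the sketch assumes that each journal entry keeps whatever is needed to reverse its operation (here, the previous value of the key).

```python
class JournaledStore:
    """Toy key-value 'server' that journals a counter-event for every operation."""

    _MISSING = object()  # marks "the key did not exist before this operation"

    def __init__(self):
        self.data = {}
        self.journal = []  # counter-events as (key, previous_value), oldest first

    def write(self, key, value):
        # Log how to undo the write before applying it.
        self.journal.append((key, self.data.get(key, self._MISSING)))
        self.data[key] = value

    def delete(self, key):
        # Log the value being removed so the delete can be reversed.
        self.journal.append((key, self.data.pop(key, self._MISSING)))

    def rewind(self, position):
        """Play back counter-events, newest first, until the journal is
        `position` entries long again."""
        while len(self.journal) > position:
            key, previous = self.journal.pop()
            if previous is self._MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = previous

store = JournaledStore()
store.write("inbox/alice", "3 messages")
consistent = len(store.journal)           # remember a known-good journal position
store.write("inbox/alice", "garbage")     # corruption begins
store.delete("inbox/alice")
store.rewind(consistent)
print(store.data)  # {'inbox/alice': '3 messages'}
```

The work done by `rewind` grows with the number of operations being undone, not with the total amount of data held in the store.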
Ultimately, continuous backup may be added and configured to monitor every update made to your servers, either all the time or, if you prefer, in between the two most recent snapshots. Should your data be corrupted, you will be able to choose between virtually unlimited restore points. Continuous backup is the only solution that will allow you to restore single or multiple databases to the most recent consistent state, sometimes logged just minutes before corruption occurred, so that the highest data integrity and negligible data loss, if any, are ensured.
As the duration of restore operations depends only on the volume of changes applied since the most recently journaled consistent state, and no actual data is moved, you will be able to simply “rewind” either single-megabyte or multi-terabyte servers in seconds.
How To Pick a Continuous Backup Solution
Solutions will differ in the speed of recovery, in the level of backup and restore automation that can be achieved, in the bandwidth overhead they will place on your application servers, and, obviously, in price. You should look out for extras. In addition to their logging of server update activity, some of the more capable solutions will allow you to store “manual” data consistency “bookmarks” at will.
This can prove especially useful when conducting server maintenance operations. Simply “plant” a bookmark before carrying out risky procedures and, should something go wrong, use it to quickly wind servers back.
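As a rough illustration of the bookmark idea, the following Python sketch plants a named bookmark before a risky change and rewinds to it automatically if the change fails. Everything in it is hypothetical: `BookmarkedJournal`, `maintenance` and the bookmark name are invented for the example, and a plain dictionary stands in for the server.

```python
from contextlib import contextmanager

class BookmarkedJournal:
    """Toy undo journal over a dict; a bookmark is a named journal position."""

    def __init__(self, data):
        self.data = data
        self.undo_log = []   # entries are (key, key_existed, old_value)
        self.bookmarks = {}

    def write(self, key, value):
        self.undo_log.append((key, key in self.data, self.data.get(key)))
        self.data[key] = value

    def plant(self, name):
        self.bookmarks[name] = len(self.undo_log)

    def rewind_to(self, name):
        target = self.bookmarks[name]
        while len(self.undo_log) > target:
            key, existed, old = self.undo_log.pop()
            if existed:
                self.data[key] = old
            else:
                del self.data[key]

@contextmanager
def maintenance(journal, name):
    """Plant a bookmark, then rewind to it if the enclosed block raises."""
    journal.plant(name)
    try:
        yield journal
    except Exception:
        journal.rewind_to(name)
        raise

server = BookmarkedJournal({"schema": "v1"})
try:
    with maintenance(server, "before-upgrade") as j:
        j.write("schema", "v2")
        j.write("index", "rebuilt")
        raise RuntimeError("upgrade failed")  # simulated failure
except RuntimeError:
    pass
print(server.data)  # {'schema': 'v1'}
```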
Beware of jacks-of-all-trades. For best results, choose an application-aware solution that has been tailored to work seamlessly with your particular application server. This will speed deployment and database or application component discovery, and will provide you with the greatest ease of use and reliability. The preferred solution should at least provide dedicated support for industry-standard application servers, such as Microsoft Exchange, Microsoft SQL Server and Oracle.
You might also want to consider enhancing your backup strategy with real-time replication technology, which will add a high-availability layer to your data infrastructure, along with optional automatic fail-over. By generating a real-time copy of your data that is continuously kept up to date, this technology delivers the most important benefit of all: 24/7 service availability for your users.
They will appreciate being able to work uninterrupted and will be even happier when, should disaster strike, you have them up and running again in a matter of seconds.
Leonid Shtilman is the founder and CEO of XOsoft, a provider of business continuity software solutions that enable instantaneous recovery from any type of disaster, including common data corruptions.