A bit of theory
There are two types of file backup: online backup and long-term backup. Both are necessary, but they serve different purposes. With an online backup you can restore yesterday's version of a document or database, or a file that was deleted by mistake. A long-term archive is used mostly for insurance or historical purposes: to restore a database as it was long ago, or a file deleted or modified way back. An online archive is created at least daily. A long-term archive is, in fact, just one of the online copies; it can be created weekly, monthly, etc., and stored on an independent medium such as an optical disc, magnetic tape, or a hard drive on a remote server.
What are the types of backup?
- Full backup. All selected files are archived. Each archive is independent of the previous ones.
- Incremental backup. First a full backup is made; after that, only files that are new or modified since the last backup (of any type) are archived.
- Differential backup. First a full backup is made; after that, only files that are new or modified since the last full backup are archived.
Everything is clear with a full backup: all selected files are archived every time. If you do a full backup every day, you can always restore the relevant file or folder from a single archive. This is convenient for fast recovery, but wasteful in terms of disk space and archiving time. If the volume of archived data is small and you have plenty of disk space for archive storage, it is a perfect option for both the system administrator and the business owner: any file, folder, or database can be restored in a matter of minutes.
Incremental backup saves the most time and space, but recovering a frequently modified file can drive a sysadmin crazy: it requires successively restoring every modification of that file from every archive copy in the chain. And what if there are 10 such files, or 100? And the intermediate archives are stored on different media? Moreover, if even one intermediate archive is damaged, the file cannot be restored at all. This makes the method poorly reliable. The conclusion: if you want to restore data quickly and without problems, avoid this method, because time and data are money.
Differential backup is a compromise between the first two. The program tracks which files have been modified or created since the last full backup and places them into a new archive. Every new differential archive is therefore larger than the previous one and, technically, would keep growing indefinitely. Hence the main thing is to choose a sensible archiving period, after which a new full archive is created and the cycle starts over. To restore a folder you need only the last full archive and the differential archive of the required date. As you can see, this type is more reliable than incremental backup and quite fast in terms of data recovery.
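To make the recovery difference concrete, here is a sketch using the extract command of a console archiver such as rar.exe (introduced below); the archive names and the restore path are hypothetical examples:

```
rem Incremental recovery: the full archive plus EVERY increment, in order
rar x Full_110101.rar D:\Restore\
rar x -o+ Inc_110102.rar D:\Restore\
rar x -o+ Inc_110103.rar D:\Restore\
rem ...and so on, one command per day up to the required date

rem Differential recovery: the full archive plus only the LAST differential
rar x Full_110101.rar D:\Restore\
rar x -o+ Diff_110112.rar D:\Restore\
```

The `-o+` switch tells rar to overwrite existing files, so each later archive replaces the older versions restored before it. With incremental backup the chain grows with every day; with differential backup it is always exactly two commands.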
Typical requirements for file backup
What does a typical small company require from a file backup program and its archives? Here is the list:
- an archive should be created quickly
- data from different folders and servers should be stored in one archive
- a created archive should be in RAR, ZIP, or 7z format
- the archive for any required date should be easy to find
- the needed data should be easy to locate within an archive
- the needed data should be quickly and easily restored to any required location
Frankly speaking, all of this is difficult to accomplish in one program. For example, KLS Backup 2011, a program I otherwise liked, did everything I needed but flatly refused to work with folders whose names contained non-Latin characters; other enterprise backup programs were rather expensive and not flexible enough. So none of the programs I tested met my requirements for one reason or another. And that may be fine: it is impossible to design a universal program that suits everyone perfectly. Every company has its own business processes and its own set of software and hardware. Maybe the ideal exists somewhere, but I haven't found it yet. So I had to reinvent the wheel.
How to make the perfect backup
My task was to organize backups of different types of data stored on several servers, logical drives, and folders, spending as little money as possible while meeting the requirements listed above. In addition, the whole backup process had to take no longer than 12 hours. The total volume of archived data was about 50 GB. Can you imagine how much space it would take to store full archives of that volume every day?
I decided to write my own backup script. In my experience, the best way for a system administrator to implement a perfect backup is to write a script that performs all the necessary operations. You can create such a script quickly and easily with a tool called Dr.Batcher (http://www.drbatcher.com), a handy batch file editor for system administrators and a perfect solution for creating batch files.
As the archiving program I chose the WinRAR archiver, or, to be precise, its console version rar.exe. Its advantages are a large set of command-line switches, an available 64-bit version, multicore processor support, and high archiving speed combined with a high compression ratio. And it costs only $29 per license.
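A typical archiving call with rar.exe looks something like the line below; the drive letters, archive name, and log path are illustrative examples, not the author's actual setup:

```
rar a -r -m3 -agYYMMDD D:\Backup\temp\Disk_X.rar X:\Data\
```

Here `a` adds files to an archive, `-r` recurses into subfolders, `-m3` selects the normal compression level, and `-agYYMMDD` appends the current date to the archive name, so the command above produces a file like Disk_X110112.rar, matching the naming scheme described later in this article.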
To my mind, the following backup workflow suits practically any small business:
- Financial databases get a full backup, so that no time is wasted on successive recovery: the cost of idle time is high. The archiving period is 10 working days, and a separate storage folder is created for each archive set.
- User files get a differential backup. The archiving period is also 10 working days, and a separate storage folder is created for each archive set.
- The tasks are carried out by bat-files whose work is logged. Log files are stored in a logs subdirectory on the same server.
- The script runs on the most powerful server. In my case it is a fairly hefty HP server with two quad-core 2.8 GHz Xeons, 18 GB of RAM, and Windows Server 2008 R2 Standard x64.
- First, each archive is created in a temporary directory on the archive server and then automatically moved to the remote server for storage. I found out experimentally that this is the fastest method. All servers are connected by a gigabit network, which makes the transfer fast: for example, a 14 GB archive is moved within 4 minutes.
- The backup starts no sooner than an hour after the workday ends.
- Each archive name consists of a topic name and the archive creation date in YYMMDD form. For example, the archive of disk X for January 12, 2011 is Disk_X110112.rar.
- If there are more than 10 archives in a directory, the oldest one is deleted automatically.
- Every month, archives are manually transferred to an independent storage medium for long-term storage, with a retention term of one year.
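The workflow above can be sketched as a single bat-file. This is only an outline under stated assumptions: the paths, the server name BACKUPSRV, the share, and the topic name Disk_X are hypothetical, and it assumes a preceding full backup cleared the archive attribute with rar's `-ac` switch, so the differential run can pick changed files with `-ao`:

```
@echo off
rem Sketch of a daily differential backup task (example paths and names)
set TMP_DIR=D:\Backup\temp
set REMOTE=\\BACKUPSRV\Archives\Disk_X

rem Create the differential archive in the local temporary directory.
rem -ao adds only files with the archive attribute set (i.e. changed
rem since the last full backup, which cleared the attribute with -ac);
rem -agYYMMDD appends the date, producing e.g. Disk_X110112.rar
rar a -r -m3 -ao -agYYMMDD "%TMP_DIR%\Disk_X.rar" X:\Data\ >> logs\backup.log

rem Move the finished archive to the remote storage server
move /Y "%TMP_DIR%\Disk_X*.rar" "%REMOTE%\" >> logs\backup.log

rem Rotation: list archives newest-first, skip the 10 newest,
rem delete the rest
for /f "skip=10 delims=" %%F in ('dir /b /o-d "%REMOTE%\Disk_X*.rar"') do del "%REMOTE%\%%F"
```

A scheduled task launched an hour after the end of the workday would run this script; the full-backup variant differs only in dropping `-ao` and adding `-ac` to reset the archive attributes.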
After I replaced the previously used program with the new scripts, archiving time dropped from 15 hours to 3! In addition, the space used on the archive disk was halved thanks to the differential archives, which allowed me to lengthen the online backup period. It was worth the effort, wasn't it?
Bio: Roger Vadey is an expert in batch files. He is a system administrator with more than 10 years of experience and a cofounder of Mental Works Computing Software, an independent software vendor focusing on automation products.