- To protect Control-M against database failures, back up the database in a way that allows recovery to the point of failure.
- Standard file backup tools are not supported while the database is live. A file-level backup of the database files taken while the database is live breaks data integrity, consistency, and completeness.
- Control-M utilities provide two ways to back up the application database.
Cold Backup:
- A Cold backup, the traditional method, is performed while the application is down and no updates are being made to the database, which protects the integrity of the data.
- A Cold backup can be inconvenient in a mission-critical environment that must be up 24/7.
- Additionally, a Cold backup restores only to the point at which the backup was taken.
- For the Control-M/Server database, this is not an ideal recovery point.
Hot Backup:
- Control-M provides a second mechanism for backing up a live database without shutting down the Control-M application. This is referred to as a Hot backup.
- Data integrity is maintained in a Hot backup because, together with the copy of the running database, it includes the running sequence of transactions (changes) up to a given point in time.
- The Control-M/Server database should be restored to the point of failure to ensure that job flow processing continues from the point of failure and not from a few hours earlier.
- A Hot backup is made through an optional configuration called “Archiving” mode.
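The difference between the two recovery points can be illustrated with a toy model. This is only a conceptual sketch: the job names, the dictionary "database", and the `restore` function are hypothetical and do not represent how Control-M or the database vendor stores or replays data.

```python
def restore(baseline, archived_changes):
    """Rebuild a state by replaying changes, in order, on top of a baseline copy."""
    state = dict(baseline)  # work on a copy so the baseline stays intact
    for key, value in archived_changes:
        state[key] = value
    return state


# Hypothetical state captured when the backup was taken.
baseline = {"job_A": "ended OK", "job_B": "waiting"}

# Hypothetical changes recorded after the backup, up to the point of failure.
changes = [("job_B", "executing"), ("job_B", "ended OK"), ("job_C", "waiting")]

# A Cold backup has no change sequence, so it restores to the backup time.
cold_restore = restore(baseline, [])

# A Hot backup plus its change sequence restores to the point of failure.
hot_restore = restore(baseline, changes)

print("Cold restore:", cold_restore)
print("Hot restore: ", hot_restore)
```

In this model the Cold restore still shows job_B as waiting, so processing would resume from the backup time rather than from the point of failure.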
Control-M Archive Mode for Hot Backup
- When archiving is enabled, incremental database changes are written to an Archived Log file. After a configured amount of time, or once the transaction log reaches a configured size, the archived file is closed and made available for shipping to a designated safe storage location.
- Archived Log files are labeled sequentially from the time a Hot backup starts until the next Hot backup is executed, at which point the log file sequence is reset to the beginning.
- Archived Log files should be stored in a location that is protected from machine and disk crashes.
- The active Archived Log file is kept open while incremental changes are taking place in the database.
- Because Archived Log files are simple flat files, there is a risk that some data that has already been committed is still buffered in operating system (OS) memory and has not yet been written to the file.
- To manage OS file buffering, most database vendors provide a File Synchronization mechanism, which ensures that on every database commit the memory buffers are flushed and the data is written to disk.
- In Postgres this is the “fsync” parameter. When it is turned on, the memory buffers are flushed and the data is written safely to disk on every database commit. (There is always some performance penalty when fsync is turned on.)
- On databases with light activity, it is also recommended to add the parameter “archive_timeout” with a value of 300 to the file postgresql.conf, which forces an archiving activity every 5 minutes.
- When the database uses flat files for data storage, the OS maintains a file buffer in kernel memory, and information left in the memory buffers can create a data inconsistency if the machine unexpectedly crashes or is rebooted.
- This inconsistency can corrupt the flat files during a crash, and the ability to restore the database to the point of failure is lost.
- Therefore, setting File Synchronization mode to flush the data on transaction commit and minimizing the size of the Archived Log files improves the ability to recover to the latest transaction before the point of failure.
- The number of Archived Log files can grow dramatically if there is a lot of activity. To keep the recovery process manageable, we recommend making a Hot backup every 24 to 48 hours.
- The Hot backup creates a new baseline and thus makes all previous Archived Log files irrelevant.
- As soon as a Hot backup is made, the baseline for a full restore exists, and this backup should be moved to a safe location.
- All Archived Log files created after the Hot backup are incremental backups associated with the new Hot backup baseline.
- A cyclic task should be put in place that moves the closed Archived Log files to a safe location, to prevent the loss of this data in case of a disk crash.
- The interval of the cyclic task that ships the Archived Log files, and the size of each Archived Log file, depend on how much time one can afford to spend manually recovering and restoring to the point of failure after a disk crash.
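The two Postgres settings mentioned above can be checked with a short script. The following is a minimal sketch, not a Control-M utility: the function names are hypothetical, the parser handles only simple `name = value` lines, and it accepts “archive_timeout” only as a plain number of seconds (values written with a unit are reported rather than interpreted).

```python
import re


def read_settings(conf_text):
    """Parse simple 'name = value' lines from postgresql.conf-style text."""
    settings = {}
    for raw in conf_text.splitlines():
        # Drop comments (simplification: a '#' inside a quoted value is not handled).
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_.]*)\s*=?\s*(.*)$", line)
        if match:
            name, value = match.groups()
            settings[name.lower()] = value.strip().strip("'")
    return settings


def check_archiving_settings(settings, max_timeout=300):
    """Return a list of findings for fsync and archive_timeout."""
    problems = []
    # fsync is treated as on when the parameter is absent.
    if settings.get("fsync", "on").lower() not in ("on", "true", "yes", "1"):
        problems.append("fsync is not on: committed data may remain in OS buffers")
    # archive_timeout is treated as 0 (no forced archiving) when absent.
    timeout = settings.get("archive_timeout", "0")
    if not timeout.isdigit():
        problems.append(f"archive_timeout={timeout!r} uses a unit this sketch does not parse")
    elif int(timeout) == 0 or int(timeout) > max_timeout:
        problems.append(
            f"archive_timeout={timeout} does not force archiving every {max_timeout} seconds"
        )
    return problems


if __name__ == "__main__":
    sample = """
    fsync = off            # illustrative value only
    archive_timeout = 600
    """
    for finding in check_archiving_settings(read_settings(sample)):
        print("WARNING:", finding)
```

To check a real installation, the text of the actual postgresql.conf would be read from wherever the database server keeps it and passed to `read_settings`.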
The following summarizes the backup strategy discussed above:
1. Enable Archiving mode. This option is necessary in order to run Hot backups.
2. Perform a Hot backup. Create a baseline of the database by performing a Hot backup. A Hot backup can be executed at any time without bringing down the Control-M application. Create a new baseline every 24 to 48 hours, depending on the amount of activity and the size of the Archived Log files. Ship the baseline backup file to a safe storage location.
3. Ship Archived Log files. As soon as an Archived Log file is closed, it should be shipped to a safe storage location. A cyclic task should be created that checks whether the last active Archived Log file has been closed and then ships that file to a safe storage location.
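The cyclic shipping task in step 3 could look roughly like the sketch below. It rests on several assumptions that are not part of the product: the directory arguments are placeholders, the function name is hypothetical, and a file is treated as closed when it has not been modified for `quiet_seconds`. The source does not say how closure is detected, so that test should be replaced with whatever applies in a given environment.

```python
import shutil
import time
from pathlib import Path


def ship_closed_logs(archive_dir, safe_dir, quiet_seconds=60, now=None):
    """Move archived log files that have not changed for quiet_seconds.

    Returns the names of the files that were shipped, in sorted order.
    """
    now = time.time() if now is None else now
    archive_dir, safe_dir = Path(archive_dir), Path(safe_dir)
    safe_dir.mkdir(parents=True, exist_ok=True)

    shipped = []
    for log_file in sorted(archive_dir.iterdir()):
        if not log_file.is_file():
            continue
        # Assumption: a recently modified file may still be the active log.
        if now - log_file.stat().st_mtime < quiet_seconds:
            continue
        target = safe_dir / log_file.name
        shutil.copy2(log_file, target)
        # Remove the source only after the copy has the same size.
        if target.stat().st_size == log_file.stat().st_size:
            log_file.unlink()
            shipped.append(log_file.name)
    return shipped
```

A scheduler (for example cron, or a cyclic job) would call `ship_closed_logs` at the chosen interval, with `safe_dir` pointing to storage on a different machine or disk than the database.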
More Information:
- Both Hot and Cold backup operations perform full database backups using the database vendor's utilities and create a backup that is application independent.
- To restore such a backup, Control-M must be down, and the entire system is restored to the point of the last Archived Log file.
- A full database backup can also serve to safekeep previous versions of the application data.
- Performing a backup every 48 hours allows the user to restore the entire database to a previous point in time.
- This is useful when a user mistakenly modifies application data, such as job definitions, and needs to return to some point in the past.
- To restore application data from a full backup, one needs to build, or have ready, a test or failover standby environment where the entire database can be restored. The application data needed to revert the changes in the Control-M data can then be extracted selectively.
- See the XML utilities for extracting a specific set of EM application data from an environment.
- Starting with version 6.4, job versioning is supported in the product, and users can restore previous versions of job definitions using standard application functionality.