The Pleasure of Finding Oracle Things Out
Yesterday I was speaking with a colleague about various technical topics (as we frequently do) and he brought up a good question. He remembered reading that upon the issuance of the ALTER TABLESPACE END BACKUP command Oracle creates a redo record with the begin backup SCN and he wanted to know why Oracle needed to put this SCN in the redo record? More specifically, why doesn’t Oracle create the end backup marker in the redo stream with just the file number? Then during recovery, Oracle could simply read this redo record and consider all “hot-backup-necessary” redo for that file applied?
That seemed like a fair question. I told him that I had seen this begin backup SCN in a log file dump associated with an END BACKUP statement. So, I knew the information he read is correct. However, I could not immediately give him an answer for this mechanic. Of course, the fact that I could not initially get my mind around the mechanic and formulate a response other than, “I don’t really know” bothered me. I assume it is fairly common knowledge that all redo between the BEGIN BACKUP and END BACKUP commands is required for recovery; every good backup and recovery guide tells us this. But, I really didn’t know why the begin backup SCN was required in the redo record created from the END BACKUP statement.
I took pen to paper and reviewed what I knew of the END BACKUP statement.
- When issued it tells Oracle to unfreeze the frozen section of the header. Notice I said section. The entire header of each affected file is not immutable during the course of a user-managed hot backup. For instance, the hot backup SCN and checkpoint counter in the header is moving in accord with the database changes to the file(s).
- Oracle updates the master checkpoint SCN in the file header and control file with the hot backup SCN (the current SCN for the file). Remember a file in backup mode is still subject to normal modifications, checkpoints, etc.
- The status in the file header goes from 0x1 (hot backup mode) to 0x4 (normal online mode).
- A redo record for each file in the tablespace is created with the file number and begin backup SCN.
- During recovery after the end backup redo record is encountered the file in question has its status updated from 0x1 (hot backup mode) to 0x0 (consistent mode).
Here is an example of a redo record created for a file affected by the END BACKUP statement. I have bolded the begin backup SCN.
SCN: 0x0000.004b0b57 SUBSCN: 1 02/23/2006 15:59:59
CHANGE #1 MEDIA RECOVERY MARKER SCN:0x0000.00000000 SEQ: 0 OP:17.1
End backup marker - file:3 scn: 0x0000.004b0b4c
Okay, let me return to my colleague’s question. Why does Oracle place the begin backup SCN in the redo record and not just the file number? Consider the following:
At time y we take a complete hot backup of our production database. Of course we cannot instantaneously take a backup at a single point in time as it takes some time to get all tablespaces in hot backup mode. But, for the sake of this example let y be the “time” we associate with our most recent backup of the database. We also have complete hot backups of our database at times y-n, and y-2n where n is a day, week or some other arbitrary time interval.
What happens if we restore all but one of our files from the backup taken at y. Maybe, we had a corrupt version restored for one of the files and we had to restore a copy from the y-2n backup. That is, we have one file restored from y-2n and the rest from y. Our recovery would start using log files from y-2n as that is the oldest file in our restored database. In our redo stream we have redo records for the END BACKUP commands for the files backed up during the y-2n backup. Remember, each of these redo records contains this mysterious begin backup SCN. While rolling forward Oracle would hit these redo records from the y-2n complete hot backup. What would happen if Oracle simply put the file number in the redo record for the END BACKUP command? It would immediately update the file headers to 0x0 (file is consistent and needs no further “hot-backup-necessary” redo) for the files backed up at y. Well, this would cause major issues. The recovery would stop after all redo records containing the end backup marker from the y-2n backup had been encountered. However, Oracle provided another level of protection from this scenario, the begin backup SCN.
From the previous scenario, with the begin backup SCN embedded in the redo record for each file, Oracle would continue to recover through the necessary redo. It would consider a file consistent only when it found the redo record for the end backup statement that contained the begin backup SCN that corresponds with the begin backup SCN stored in the file header. The begin backup SCN is a portion of the frozen section of the file header when the BEGIN BACKUP command was issued.
All is well, now that it feels I have gotten my mind around this one.