Sunday, September 6, 2009

What is the ARIES Recovery Algorithm?

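ARIES ('Algorithms for Recovery and Isolation Exploiting Semantics') is a recovery algorithm designed for databases that use a no-force, steal buffer policy: pages need not be written to disk at commit time (no-force), and pages modified by uncommitted transactions may be written to disk (steal). It is used by IBM DB2, Microsoft SQL Server and many other database systems.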

Three main principles lie behind ARIES:
- Write ahead logging: Any change to an object is first recorded in the log, and the log must be written to stable storage before changes to the object are written to disk.
- Repeating history during Redo: On restart after a crash, ARIES retraces the actions of the database before the crash and brings the system back to the exact state it was in at the time of the crash. Then it undoes the transactions that were still active at crash time.
- Logging changes during Undo: Changes made to the database while undoing transactions are logged to ensure such an action isn't repeated in the event of repeated restarts.
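
The write-ahead logging rule can be illustrated with a small sketch. This is only a toy model, not how any particular database implements it: the `Log` and `Page` classes, the `page_lsn` field and the `disk` dictionary are all names assumed for this example. The point it shows is that a page is never written to disk until the log has been flushed up to the last record that changed that page.

```python
class Log:
    """Toy log: LSNs are 1-based positions in a list (an assumption of this sketch)."""

    def __init__(self):
        self.records = []
        self.flushed_lsn = 0  # highest LSN known to be on stable storage

    def append(self, payload):
        self.records.append(payload)
        return len(self.records)  # LSN of the record just appended

    def flush(self, up_to_lsn):
        # Force the log to stable storage up to the given LSN.
        self.flushed_lsn = max(self.flushed_lsn, min(up_to_lsn, len(self.records)))


class Page:
    def __init__(self, page_id):
        self.page_id = page_id
        self.data = {}
        self.page_lsn = 0  # LSN of the last log record that changed this page


def update(log, page, key, value):
    # Record the change in the log first, then change the buffered page.
    lsn = log.append(("update", page.page_id, key, page.data.get(key), value))
    page.data[key] = value
    page.page_lsn = lsn
    return lsn


def write_page_to_disk(log, page, disk):
    # WAL rule: the log must reach stable storage before the page does.
    if log.flushed_lsn < page.page_lsn:
        log.flush(page.page_lsn)
    disk[page.page_id] = (page.page_lsn, dict(page.data))
```

Note that `update` alone never forces the log; only writing the page out (or, in a real system, committing) does, which is what makes the no-force, steal combination workable.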

The ARIES recovery procedure consists of three main steps:
- Analysis: Identifies the dirty (updated) pages in the buffer and the set of transactions active at the time of the crash. It also determines the point in the log where the REDO operation should start.
- REDO phase: Reapplies updates from the log to the database. In many recovery schemes REDO is applied only to committed transactions; ARIES instead redoes the updates of all transactions, committed or not. Information in the ARIES log gives the start point for REDO, and REDO operations are applied from there until the end of the log is reached, so only the necessary REDO operations are applied during recovery.
- UNDO phase: The log is scanned backwards and the operations of transactions that were active at the time of the crash are undone in reverse order.

The information ARIES needs to carry out recovery consists of the log, the transaction table, and the dirty page table. In addition, checkpointing is used.
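
The three passes can be sketched end to end as follows. This is a simplified illustration under several assumptions that are not from the text above: log records are plain dictionaries with hypothetical field names (`lsn`, `type`, `txn`, `prev_lsn`, `page`, `key`, `before`, `after`, `undo_next`), only `update`, `clr` and `commit` records exist, each page on disk carries a `page_lsn`, and the analysis pass scans the whole log instead of starting from a checkpoint.

```python
def analysis(log):
    """Forward scan: find unfinished transactions and possibly dirty pages."""
    txn_table = {}    # txn id -> LSN of its last log record (unfinished txns only)
    dirty_pages = {}  # page id -> RecLSN (earliest record that may not be on disk)
    for rec in log:
        if rec["type"] == "commit":
            txn_table.pop(rec["txn"], None)
        else:  # "update" or "clr"
            txn_table[rec["txn"]] = rec["lsn"]
            dirty_pages.setdefault(rec["page"], rec["lsn"])
    return txn_table, dirty_pages


def redo(log, pages, dirty_pages):
    """Repeat history: reapply every missing update, committed or not."""
    if not dirty_pages:
        return
    start = min(dirty_pages.values())
    for rec in log:
        if rec["lsn"] < start or rec["type"] not in ("update", "clr"):
            continue
        if rec["lsn"] < dirty_pages.get(rec["page"], float("inf")):
            continue
        page = pages.setdefault(rec["page"], {"page_lsn": 0, "data": {}})
        if page["page_lsn"] >= rec["lsn"]:
            continue  # the page on disk already reflects this record
        page["data"][rec["key"]] = rec["after"]
        page["page_lsn"] = rec["lsn"]


def undo(log, pages, txn_table):
    """Backward pass: roll back unfinished transactions, logging a CLR per undo."""
    by_lsn = {rec["lsn"]: rec for rec in log}
    to_undo = dict(txn_table)  # txn id -> next LSN to look at
    while to_undo:
        txn, lsn = max(to_undo.items(), key=lambda item: item[1])
        rec = by_lsn[lsn]
        if rec["type"] == "clr":
            nxt = rec["undo_next"]  # skip what was already compensated
        else:
            clr = {"lsn": log[-1]["lsn"] + 1, "type": "clr", "txn": txn,
                   "prev_lsn": txn_table[txn], "page": rec["page"],
                   "key": rec["key"], "after": rec["before"],
                   "undo_next": rec["prev_lsn"]}
            log.append(clr)
            by_lsn[clr["lsn"]] = clr
            txn_table[txn] = clr["lsn"]
            page = pages[rec["page"]]
            page["data"][rec["key"]] = rec["before"]
            page["page_lsn"] = clr["lsn"]
            nxt = rec["prev_lsn"]
        if nxt is None:
            del to_undo[txn]  # transaction fully rolled back
        else:
            to_undo[txn] = nxt


def recover(log, pages):
    txn_table, dirty_pages = analysis(log)
    redo(log, pages, dirty_pages)
    undo(log, pages, txn_table)
    return pages
```

Because the undo pass appends compensation records to the log, running `recover` a second time over the extended log (for example after a crash during recovery) redoes those compensations and then finds nothing left to undo, so no update is undone twice.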

Log records contain the following fields:
- Type (CLR, update, special)
- TransID
- PrevLSN (LSN of the previous record of this transaction)
- PageID (for updates/CLRs)
- UndoNxtLSN (for CLRs)
* points to the next log record of the transaction that still has to be undone (the PrevLSN of the record being compensated)
* on later undos, rollback jumps straight to UndoNxtLSN, skipping the records that were already compensated
- Data (redo/undo data); can be physical or logical.
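
As a rough illustration of how these fields fit together, the sketch below models a log record as a Python dataclass and walks a transaction's record chain backwards. The class, the field spellings and the helper `records_still_to_undo` are assumptions made for this example only. It shows how a CLR's UndoNxtLSN lets rollback skip records that were already compensated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    type: str                           # "update", "clr", or a special type
    trans_id: str
    prev_lsn: Optional[int]             # previous record of the same transaction
    page_id: Optional[str] = None       # only for updates and CLRs
    undo_nxt_lsn: Optional[int] = None  # only for CLRs
    data: Optional[tuple] = None        # redo/undo information


def records_still_to_undo(log, last_lsn):
    """Return the LSNs of update records that a rollback still has to undo.

    Starts at the transaction's last record and follows PrevLSN, except that
    a CLR redirects the walk to its UndoNxtLSN.
    """
    by_lsn = {rec.lsn: rec for rec in log}
    pending = []
    lsn = last_lsn
    while lsn is not None:
        rec = by_lsn[lsn]
        if rec.type == "clr":
            lsn = rec.undo_nxt_lsn  # everything in between is already undone
        else:
            if rec.type == "update":
                pending.append(rec.lsn)
            lsn = rec.prev_lsn
    return pending
```

For example, if a transaction wrote updates with LSNs 1, 2 and 3 and then a CLR (LSN 4) compensating record 3, the CLR's UndoNxtLSN is 2, so a later rollback only has to undo records 2 and 1.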

Transaction Table :
- Stores for each transaction:
* TransID, State.
* LastLSN (LSN of last record written by txn).
* UndoNxtLSN (next record to be processed in rollback).
- During recovery:
* Initialized during analysis pass from most recent checkpoint.
* Modified during analysis as log records are encountered, and during undo.
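
A minimal sketch of how the transaction table might be maintained as log records are seen, either during normal processing or during the analysis pass. The class name, the dictionary-based records and the `end` record type (marking a transaction that has completely finished) are assumptions of this example, not details given above.

```python
class TransactionTable:
    def __init__(self):
        # TransID -> {"state", "last_lsn", "undo_nxt_lsn"}
        self.entries = {}

    def on_log_record(self, rec):
        """Update the table for one log record (a dict in this sketch)."""
        if rec["type"] == "end":
            # The transaction is completely finished; forget it.
            self.entries.pop(rec["trans_id"], None)
            return
        entry = self.entries.setdefault(
            rec["trans_id"],
            {"state": "active", "last_lsn": None, "undo_nxt_lsn": None},
        )
        entry["last_lsn"] = rec["lsn"]
        if rec["type"] == "update":
            # A rollback would have to undo this record next.
            entry["undo_nxt_lsn"] = rec["lsn"]
        elif rec["type"] == "clr":
            # Part of the rollback is done; continue from the CLR's pointer.
            entry["undo_nxt_lsn"] = rec["undo_nxt_lsn"]
        elif rec["type"] == "commit":
            entry["state"] = "committed"
```

After the analysis pass, the entries still in the `active` state are the transactions that the undo pass has to roll back, each starting from its `undo_nxt_lsn`.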

Dirty Pages Table
- During normal processing :
* When a page is fixed with intention to update: let L = current end-of-log LSN (the LSN of the next log record to be generated); if the page is not dirty, store L as the RecLSN of the page in the dirty pages table.
* When page is flushed to disk, delete from dirty page table.
* Dirty page table written out during checkpoint.
* (Thus RecLSN is LSN of earliest log record whose effect is not reflected in page on disk).
- During recovery :
* Load dirty page table from checkpoint.
* Updated during analysis pass as update log records are encountered.
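
The normal-processing rules for the dirty pages table are simple enough to sketch directly. The class and method names below are made up for this example. The `redo_start` helper reflects the usual ARIES rule that the redo pass begins at the smallest RecLSN in the table, which the text above only hints at.

```python
class DirtyPageTable:
    def __init__(self):
        self.rec_lsn = {}  # page id -> RecLSN

    def fix_for_update(self, page_id, end_of_log_lsn):
        # Only the first update since the last flush sets RecLSN;
        # later updates to an already dirty page leave it unchanged.
        self.rec_lsn.setdefault(page_id, end_of_log_lsn)

    def page_flushed(self, page_id):
        # The disk copy is now current, so the page is no longer dirty.
        self.rec_lsn.pop(page_id, None)

    def snapshot(self):
        # Copy of the table, e.g. to be written out with a checkpoint.
        return dict(self.rec_lsn)

    def redo_start(self):
        # Earliest log record whose effect may be missing from some page on disk.
        return min(self.rec_lsn.values(), default=None)
```

For instance, if page P1 is first updated when the end-of-log LSN is 10 and again at 15, its RecLSN stays 10 until the page is flushed.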

Checkpoints :
- Begin_chkpt record is written first.
- Transaction table, dirty_pages table and some other file management information are written out.
- End_chkpt record is then written out.
* For simplicity all of the above are treated as part of the end_chkpt record.
- LSN of begin_chkpt is then written to the master record in a well-known place on stable storage.
- A checkpoint is incomplete if the system crashes before the end_chkpt record is written; the master record then still points to the previous complete checkpoint.
- Pages need not be flushed during checkpoint
* They are flushed on a continuous basis.
- Transactions may write log records during checkpoint.
- Can copy dirty_page table fuzzily (hold latch, copy some entries out, release latch, repeat).
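
The checkpoint protocol above can be sketched as follows. It is a toy model with assumed names: the log is a Python list whose LSNs are 1-based positions, the master record is a dictionary, and the tables are embedded in the end_chkpt record as the simplification above suggests. The sketch shows that the master record is updated only after end_chkpt is written, so a crash in the middle of a checkpoint leaves the previous complete checkpoint in effect.

```python
def begin_checkpoint(log):
    """Write the begin_chkpt record and return its LSN."""
    lsn = len(log) + 1
    log.append({"lsn": lsn, "type": "begin_chkpt"})
    return lsn


def end_checkpoint(log, master, begin_lsn, txn_table, dirty_pages):
    """Write end_chkpt with copies of the tables, then update the master record."""
    log.append({
        "lsn": len(log) + 1,
        "type": "end_chkpt",
        "txn_table": dict(txn_table),
        "dirty_pages": dict(dirty_pages),
    })
    # Only now does the checkpoint count as complete.
    master["begin_chkpt_lsn"] = begin_lsn


def load_checkpoint(log, master):
    """Return (begin_chkpt LSN, transaction table, dirty page table) for restart."""
    begin_lsn = master.get("begin_chkpt_lsn")
    if begin_lsn is None:
        return None, {}, {}  # no complete checkpoint: analyse the whole log
    # log[begin_lsn:] holds the records that follow the begin_chkpt record.
    for rec in log[begin_lsn:]:
        if rec["type"] == "end_chkpt":
            return begin_lsn, dict(rec["txn_table"]), dict(rec["dirty_pages"])
    return None, {}, {}
```

Other transactions are free to append log records between `begin_checkpoint` and `end_checkpoint`; the analysis pass picks those up because it starts scanning at the begin_chkpt LSN.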
