Friday, February 1, 2008

How to do Recovery on MS SQL Server Database ?

One day, our server that hosts MS SQL is crashed. May be this is caused by some worms that infected the Windows system, or maybe it is caused by crashed in physical media storage, or maybe whatever.

Upon restarted, about 40% instances of the database collection in MS SQL Server (we are using SQL 2000) turned gray or suspected. So I choosed one of the database to be examined.
After many failed attempt to access the data in one instanced of the grayed database, I tried to detach this database. When tried to attach it again, I got this message :






So the reason that this database is grayed out in SQL Enterprise Manager (EM) is because of this problem. Next, I am trying to do this using the stored procedure "sp_attach_db" in Query Analyzer :

sp_attach_db 'edoc_test3', 'c:\mydata\edoc_data.mdf'

It gives the following error message :

Server: Msg 823, Level 24, State 2, Line 1
I/O error (torn page) detected during read at offset 0x00000000012000 in file 'c:\mydata\edoc_data.mdf'.


Connection Broken

It says there is torn page on offset 12000 hex on this file. I don't want to elaborate about this "torn page" things that already torn apart many lives that depends on MS SQL Server, rather than to say that SQL refuses to accept the defected database because some small parts of it is damaged.

The greyed status (suspect) and this problem is mean the same thing, i.e. I can't perform DBCC CHECKDB or DBCC PAGE on either the greyed (suspect) database and when the database is not attached.

In the above case, the error message given by Query Analyzer is more specific than Enterprise Manager.

The idea is to find out which page is damaged and replaced it with clean one. This technics will pass the preliminary checking of database during attach process so that the database can, at least be attached and the status icon turned yellow. Yellow means we can perform DBCC CHECKDB and DBCC PAGE.

We remember that from MS SQL Documentation of internal storage structure, the physical file is organized on blocks with size of 8192 bytes. These blocks is called pages and it is start from 0 to n.

And we also have to derive the exact location in the file where the page is defected. From the above information, offset 12000 hex = 73728 dec offset in the file.

Using the raw hexadecimal editor as WinHex, we can perform the 'surgery' of this page, provided that we have the backup of the good one. The back up of the good one have to be as close as the damaged one, the more close, the better.

From the good one, we go to the same offset, copy the 8192 bytes block and patch it to the same offset on the damaged one. This is performed with the hexa utility such as WinHex.

Repeat these steps to patch the damaged page with the good one from backup until the database can be successfully attached. Perform DBCC CHECKDB and DBCC PAGE to examine and extract the damaged data.

Some points to remember is to compare between good and damaged blocks.

Happy recoverying :)