In my previous posts, I explained how to make a crashed database attachable again, particularly a database crash caused by a torn page error.
The second phase after this is to examine the crashed data, that is, the pages that are damaged or broken: for example, the pages with torn page errors that we already 'fixed' so that the database could be attached.
Please remember that all we did in that phase was fix the torn bits so that SQL Server does not trip on the error; the integrity of the data is still questionable. Now the question is: is it possible to correct, or at least determine, the cause of the page damage? In this post, I will describe how this is possible and the restrictions around this issue.
The data recovery phase I am about to explain applies to MS SQL Server 2000 only, but this knowledge can also serve as a starting point for MS SQL Server 2005.
First, the restrictions. The restriction I am talking about is the assumption that the integrity of the MS SQL Server log data is intact. As we already know, MS SQL Server uses write-ahead logging when writing to the database: the system first writes to the log file (LDF) and makes sure that write succeeds before the change is applied to the actual database file (MDF).
Under this assumption, it is theoretically possible to use the database log file to view the modifications that were about to be made, or had just been made, to the database seconds before the system crashed.
So, to determine which data is valid in the event of a database crash, particularly page damage, I rely on the log file as the reference point for operations that were about to be, or had already been, applied to the MDF file seconds before the disaster. In short, I rely on the log file as the only valid data to check against the data in the MDF file.
To present the sample case, I will quickly review the steps I already elaborated in the previous post, and then go on to the second phase, data integrity checking. During the walkthrough of the sample case, I will also present the caveats and issues that emerge; unless each issue is resolved, I cannot go on to the next step.
Usually the issue that surfaces is a design limitation of MS SQL Server 2000, for example the inability to view the log file while the database is in emergency (bypass) mode. I hope this will be resolved in the next version of MS SQL Server.
First, I try to attach the damaged database file (MDF), and I am faced with a torn page error:
Server: Msg 823, Level 24, State 2, Line 1
I/O error (torn page) detected during read at offset 0x00000000012000 in file 'c:\Program Files\Microsoft SQL Server\MSSQL\Data\edoc_data.mdf'.
I record this offset so I can examine it later against the database log file. After fixing the torn bits (m_tornbits) and re-attaching:
Server: Msg 1813, Level 16, State 2, Line 1
Could not open new database 'edoc'. CREATE DATABASE is aborted.
Device activation error. The physical file name 'D:\Program Files\Microsoft SQL Server\MSSQL\data\eDoc_Log.LDF' may be incorrect.
This is the first issue I was talking about in MS SQL Server 2000. I tripped this error because the damaged database was located on the D: drive, whereas the server on which I am examining it uses the C: drive for its databases. There are two ways to correct this: either recreate the exact D: path on the destination server, or, when the destination server has no D: drive, use a utility such as a hex editor to change the drive letter stored in the MDF file.
After resolving the above error and re-attaching again:
Server: Msg 823, Level 24, State 2, Line 1
I/O error (torn page) detected during read at offset 0x00000004142000 in file 'c:\Program Files\Microsoft SQL Server\MSSQL\Data\edoc_data.mdf'.
This is the second issue concerning MS SQL Server 2000. The server is actually trying to redo or roll back transactions, depending on their state at the time of the crash. But alas, the page is already damaged and torn page checking is enabled, so this again trips the torn page error unless something is done to fix the m_tornbits value.
Again I fix the m_tornbits value, record the damaged offset (0x4142000) for later examination, and re-attach the database. This time the database attaches successfully, with the messages below:
604 transactions rolled forward in database 'edoc' (26).
0 transactions rolled back in database 'edoc' (26).
Now, it's time to examine the damaged page(s) using the log file.
DBCC TRACEON(3604)
dbcc log(edoc, 1)
Instead of giving me the log data from before the crash, it leaves me with practically empty log data to work with:
00000ec1:0000011a:0001 LOP_BEGIN_CKPT
00000ec1:0000011a:0002 LOP_MODIFY_ROW
00000ec1:0000011b:0001 LOP_END_CKPT
It seems that every time the database is detached and then re-attached, the server resets the position of log recording using a 'checkpoint'.
Using option 2 of the DBCC LOG command, it is possible to view the previous LSN value:
DBCC TRACEON(3604)
dbcc log(edoc, 2)
For example, the previous LSN for 00000ec1:0000011a:0001 is 00000ec1:00000117:0003. So, is it possible to view that previous LSN using DBCC LOG?
DBCC TRACEON(3604)
dbcc log(edoc, 2, 'lsn', '0xec1:117:3')
Nope, it can't. This is the third issue with MS SQL Server 2000: DBCC LOG can only filter LSNs from the last checkpoint onward, as proved by this statement:
DBCC TRACEON(3604)
dbcc log(edoc, 2, 'lsn', '0xec1:11a:2')
It gives:
00000ec1:0000011a:0002
00000ec1:0000011b:0001
This also proves that the syntax above for viewing the previous LSN is not garbage but valid syntax; the lsn option of DBCC LOG is simply not intended to reach arbitrary LSNs. In other words, there is no starting-LSN option in DBCC LOG.
Also, because of this checkpointing mechanism, the information about the last checkpoint before the crash is lost as well:
DBCC TRACEON(3604)
DBCC PAGE ('edoc', 1, 9, 3)
dbi_checkptLSN
--------------
m_fSeqNo = 3777 m_blockOffset = 282 m_slotId = 1
Converting the above values to hex yields 0xEC1:11A:1, which is equal to the first log record above. So, what was the checkpoint value at the time of the crash? Is it the same as the previous LSN recorded in the active portion of the log shown above (i.e. 00000ec1:00000117:0003)?
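This decimal-to-hex conversion between the fields printed by DBCC PAGE and the colon-separated LSN notation used by DBCC LOG comes up repeatedly, so here is a minimal Python sketch of it (the helper name is mine, not part of any tool):

def lsn_from_fields(m_fSeqNo, m_blockOffset, m_slotId):
    # format the three decimal fields from DBCC PAGE as the
    # 8:8:4 hex-digit LSN notation used in DBCC LOG output
    return "%08x:%08x:%04x" % (m_fSeqNo, m_blockOffset, m_slotId)

print(lsn_from_fields(3777, 282, 1))  # -> 00000ec1:0000011a:0001
print(lsn_from_fields(3776, 108, 6))  # -> 00000ec0:0000006c:0006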
To get that value and to prove it, I first allow direct updates to the system tables:
Sp_configure 'allow updates', 1
Reconfigure with override
Then I mark the database as emergency (bypass) mode:
update master..sysdatabases set status = 32768 where name = 'edoc'
After this, I shut down MS SQL Server, re-copy the database files from the damaged ones, and restart the server. This bypasses recovery and allows me to view the data as it is:
2008-07-03 17:01:50.80 spid8 Bypassing recovery for database 'edoc' because it is marked BYPASS.
DBCC TRACEON(3604)
DBCC PAGE ('edoc', 1, 9, 3)
dbi_checkptLSN
--------------
m_fSeqNo = 3776 m_blockOffset = 108 m_slotId = 6
Again, converting the above values to hex gives 00000ec0:0000006c:0006, which is NOT equal to 00000ec1:00000117:0003.
Now, is it possible, in emergency mode, to view the log from this checkpoint onward?
DBCC TRACEON(3604)
dbcc log(edoc, 2)
Again, nope, it is impossible:
2008-07-03 16:31:01.98 spid52 Cannot do a dbcc log() on a database in emergency mode
This, if I am not mistaken, is the fourth issue in MS SQL Server 2000. In my opinion, the reason behind this error is that in emergency (bypass) mode the log data is not initialized, so any attempt to view the log file is irrelevant:
DBCC TRACEON(3604)
DBCC DBTABLE(edoc)
LogTruncMgr @0x19FF10E8
-----------------------
m_replLSN = (0:0:0) m_ckptLSN = (0:0:0) m_oldActXact = (0:0:0)
m_backupLSN = (0:0:0) m_oldestBackupXactLSN = (0:0:0)
So I have a dead-end situation. To break through this barrier, I have to go back to the previously fixed database; this initializes the log, and then I make the DBCC LOG operation read the intended LSN (i.e. 00000ec0:0000006c:0006).
To keep this post short: this is done by breaking into the DBCC LOG routine using WinDBG and changing the in-memory value of the current LSN to the LSN to be read. The details of how this is done will be the subject of my next post.
Before re-copying the files to set the database to emergency mode, I made a backup copy so that I could step back. To step back, I reset the database to online mode, shut down the server, re-copy the database files from the previous step, and restart the server.
After that, I fire up WinDBG, attach to the sqlservr.exe process, break into DBCC LOG, set the in-memory LSN to be read to 00000ec0:0000006c:0006, and voila! I have the log data for cross-checking against the damaged MDF file.
For example, converting the error page at offset 0x4142000 gives m_pageId = 1:8353, which in hex is 1:20A1. I can then track the log file, looking for any operation on that page, which is the subject of my next posting. A quick sketch of that conversion follows.
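As a sanity check on the arithmetic, here is a minimal Python sketch (the names are mine) converting a file offset reported by error 823 into a page id:

PAGE_SIZE = 8192  # SQL Server 2000 page size in bytes

def page_from_offset(offset, file_id=1):
    # a page id is simply the byte offset divided by the page size
    return (file_id, offset // PAGE_SIZE)

file_id, page = page_from_offset(0x4142000)
print(file_id, page, hex(page))  # -> 1 8353 0x20a1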
So, stay tuned for my next two postings :)
Thursday, July 3, 2008
Saturday, June 21, 2008
How to Suppress Transaction Log Checking
In MS SQL Server, the transaction log drives the undo and redo of data in the database. Sometimes, after a database crash, we just want to be able to view the data as it is, so that the recovery steps can be carefully orchestrated. In that case, we have to find some way to suppress the undo/redo mechanism if it becomes an issue during a data recovery session.
In the event of a database crash, MS SQL Server first marks the database as suspect. If we then detach the suspect database, and the crash has affected important pages such as a PFS page, it is impossible to re-attach the database unless some of the checking mechanisms are suppressed.
Recall from my previous post that it is possible to bypass the checking mechanism by performing a 'page transplantation' from a backup database. But we are in big trouble if no backup is available, i.e. we have only the crashed database and cannot do anything with it, because it cannot be attached to be examined carefully (e.g. with the DBCC PAGE command).
The case illustrated below explains why this kind of suppression is important.
I was faced with a crashed database and no backup available. As usual, the database was marked suspect (greyed icon); after some futile attempts at recovery to bring the database up again, I detached it, and upon trying to attach it again I came upon this message:
Recall that in my previous post I described a way to 'fix' a torn page by examining and then synchronizing the sectors in that page. Examining each sector revealed that the m_tornbits value (red box) was not in sync with the torn bits of the rest of the sectors.
One of the torn bits, from a sample sector, is shown here (green box):
The torn bits at the end of every sector evaluate to 0x02, whereas the m_tornbits value is 0x01, which is why MS SQL Server gives the above error.
To get rid of this nasty torn page error preventing me from attaching the database, I change the m_tornbits value from 0x01 to 0x02 and re-attach the database using SQL Enterprise Manager. Then I arrive at this message:
It seems that I have bypassed the torn page check but tripped another check that prevents a successful attach. Examining the error log:
2008-06-21 15:10:35.43 spid51 Error while redoing logged operation in database 'MBS'. Error at log record ID (2387:408:4)
So I am faced with a transaction log issue. It seems MS SQL Server is trying to redo a transaction on this page, which is already corrupted. Fortunately the redo is unsuccessful, because even if it succeeded it would make things more complicated: we only want to inspect the database as it was, seconds before the disaster.
To bypass this mechanism, it is time to consult the 'oracle', Kalen Delaney (Inside Microsoft SQL Server 2000), again about the transaction log:
If the LSN in the page is less than the LSN of the log record, the transaction should be redone. Otherwise, if the LSN in the page is greater, the transaction should not be redone.
This 'law' implies that it is possible to bypass the redo mechanism, which is exactly what I want: by somehow changing the LSN in that page to a higher value, the redo should, in theory, be skipped.
In the page header in the picture above, the yellow box denotes the LSN of the defective page. Hence, by changing 2387 = 0x00000953 to, for example, 2388 = 0x00000954 (stored as the byte sequence 54 09 00 00 in the physical file, since integers are little-endian on disk), it is possible to bypass the transaction redo mechanism. I change the LSN value, fire up the attach-database dialogue box, and bingo! The database can be attached!
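Here is a minimal Python sketch of that patch. The byte offset of the page-header LSN is something I locate visually with a hex editor (the yellow box above), so it is passed in as a parameter rather than hard-coded; treat both the offset and the file name as assumptions:

import struct

PAGE_SIZE = 8192

def bump_page_lsn_seq(mdf_path, page_no, lsn_offset_in_page, new_seq):
    # overwrite the m_fSeqNo part of the page-header LSN with a higher
    # value so that redo ("page LSN < log record LSN") is skipped;
    # always work on a copy of the MDF, never the original
    with open(mdf_path, "r+b") as f:
        f.seek(page_no * PAGE_SIZE + lsn_offset_in_page)
        f.write(struct.pack("<I", new_seq))  # 4-byte little-endian int

# e.g. raise 2387 (0x953) to 2388 (0x954) on the affected page:
# bump_page_lsn_seq("mbs_data.mdf", damaged_page_no, lsn_offset, 2388)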
Now, with the database successfully attached, I can examine more closely which data is corrupted.
Happy transaction log-ing :)
Thursday, June 19, 2008
Which Sector is "Torn" in a Torn Page Error?
Recently I have been analyzing the "torn page" issue again. When SQL Server throws the torn page error, it does not show in which sector the page is torn. Let's review the definition of "torn page" from Kalen Delaney's "Inside Microsoft SQL Server 2000":
..it causes a bit to be flipped for each 512 byte sector in a database page (8 KB) whenever the page is written to disk..
In practice, there are two kinds of torn page error events; by 'event' I am referring to the behaviour of MS SQL 2000.
First, a torn page error during the database attach process. Usually this is caused by corrupted data in one of MS SQL Server's important pages, such as a PFS page.
Second, a torn page while accessing records, i.e. our important pages :)
Since the data is of the utmost importance, we need to determine precisely which 512-byte sector got torn apart.
To perform this kind of analysis, I create a test database (named Test), create a table and insert some records using SQL Enterprise Manager. Next, I determine the first page that contains my data:
use test
dbcc checkalloc
***************************************************************
Table TestTbl Object ID 357576312.
Index ID 0. FirstIAM (1:91). Root (1:90). Dpages 1.
Index ID 0. 2 pages used in 0 dedicated extents.
Total number of extents is 0.
***************************************************************
So the first page of my data is page 90, which is at offset 90 * 8192 = 737280 = 0xB4000 in the physical file.
Now let's look at the physical page created by MS SQL Server using a hex editor:
Here, offset 5 (marked with the green box) in the page header is somehow related to torn page checking. Turning its first nibble to zero (i.e. from 0x81 to 0x80) should bypass the torn page check. But it does not; see the paragraphs below for further information.
OK, let's go back to the original question. To determine which sector got torn, we have to examine how the m_tornbits flag relates to the last byte of each 512-byte sector. Here is how it is done:
The first byte of m_tornbits is masked with 0x03, so it ends up as 0x00, 0x01, 0x02 or 0x03. In the example above the first byte is 0x01, so this operation gives 0x01. This value is compared with the last byte of each sector from the second sector onward (sector 1, if I count the first sector as zero), i.e. starting at offset 0x03FF.
The last byte of each sector undergoes the same masking operation and is compared with the masked first byte of m_tornbits. If there is any difference, the torn page error is thrown.
So, using this operation, it is possible to determine which sector is not in sync with the rest, in other words, which sector is torn. This inevitably assumes that the first sector, which contains the page header and hence the m_tornbits value, is itself valid. A small sketch of the check follows.
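Here is a minimal Python sketch of the check described above. The position of m_tornbits inside the header is located with a hex editor (the red box in the earlier post), so it is a parameter here, an assumption rather than a documented constant:

SECTOR = 512
PAGE_SIZE = 8192

def torn_sectors(page, tornbits_offset):
    # page: the raw 8192-byte page; tornbits_offset: position of
    # m_tornbits in the header, located with a hex editor
    expected = page[tornbits_offset] & 0x03
    torn = []
    for sector in range(1, PAGE_SIZE // SECTOR):  # sectors 1..15
        last_byte = page[(sector + 1) * SECTOR - 1]  # 0x3FF, 0x5FF, ...
        if (last_byte & 0x03) != expected:
            torn.append(sector)
    return torn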
However, it is not possible to temporarily turn off torn page checking by resetting the byte at offset 5 of the page header; SQL Server seems to perform a consistency check:
So you have to change the last byte of each defective sector, one by one, in order for the page to be accepted by MS SQL Server.
Happy torn page-ing :)
Friday, June 6, 2008
Reverse Engineering Lotus Notes
The title above will provoke some hard criticism, or even warning letters sent to me demanding deletion, or perhaps instructions that the company hosting my web site be blocked.
Some will opine that I have done something forbidden by law. But as I have already written in previous posts on this blog, virtually all human activity aimed at increasing prosperity and better living can be categorized as some form of reverse engineering.
OK, enough preaching. Back to the topic above. The question is: is it possible? The very first step in reverse engineering is to identify function names. This is important because a name conveys something about the internal workings intended by the programmer or developer of the function.
Is it possible to identify function names among the myriad hex codes of calls in the assembly of Lotus Notes applications? I would say it is.
First of all, I limit the scope of this discussion to Lotus Notes running on the Windows OS, in other words the Windows version of Lotus Notes. And I am not discussing the server-side runtime routines, only the client side.
So let's start at the beginning, when the Lotus Notes program is invoked via its Windows icon. In this discussion I am using Lotus Notes version 7. The executable that spawns the Lotus Notes client is NLNOTES.EXE.
Using the WinDBG debugger, we can determine the entry point, i.e. where the Lotus Notes routines reside, so we don't get lost identifying the Windows runtime instead of the Lotus Notes runtime.
Executable search path is:
ModLoad: 00400000 0050f000 nlnotes.exe
ModLoad: 7c900000 7c9b0000 ntdll.dll
Here, with my current memory configuration, the NLNOTES.EXE file is loaded at hex address 400000. We can determine the address of the entry point using the !dh macro in WinDBG:
0:000> !dh 00400000
File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
14C machine (i386)
4 number of sections
4304A8DE time date stamp Thu Aug 18 22:27:26 2005
0 file pointer to symbol table
0 number of symbols
E0 size of optional header
10F characteristics
Relocations stripped
Executable
Line numbers stripped
Symbols stripped
32 bit word machine
OPTIONAL HEADER VALUES
10B magic #
7.10 linker version
1000 size of code
10D000 size of initialized data
0 size of uninitialized data
1C94 address of entry point
1000 base of code
----- new -----
00400000 image base
1000 section alignment
1000 file alignment
2 subsystem (Windows GUI)
4.00 operating system version
0.00 image version
4.00 subsystem version
10F000 size of image
1000 size of headers
...
Searching for the "address of entry point" string in the output gives hex 1C94. So we can calculate the actual address in memory where the entry point resides:
400000 + 1C94 = 401C94
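The same arithmetic can be done offline by reading the PE headers directly. Here is a minimal Python sketch; it relies only on the standard PE layout, with no Lotus-specific assumptions:

import struct

def entry_point_va(exe_path, image_base):
    # image base + AddressOfEntryPoint from the PE optional header,
    # which is what the !dh output above shows
    with open(exe_path, "rb") as f:
        data = f.read(0x1000)
    pe_off = struct.unpack_from("<I", data, 0x3C)[0]  # e_lfanew
    assert data[pe_off:pe_off + 4] == b"PE\x00\x00"
    # the optional header starts 24 bytes after the PE signature;
    # AddressOfEntryPoint sits at offset 16 within it
    rva = struct.unpack_from("<I", data, pe_off + 24 + 16)[0]
    return image_base + rva

print(hex(entry_point_va("nlnotes.exe", 0x400000)))  # -> 0x401c94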
Now we can set a breakpoint and go directly to the entry point of the Lotus Notes application:
0:000> bp 00401c94
0:000> g
ModLoad: 5cb70000 5cb96000 C:\WINDOWS\system32\ShimEng.dll
ModLoad: 76390000 763ad000 C:\WINDOWS\system32\IMM32.DLL
ModLoad: 629c0000 629c9000 C:\WINDOWS\system32\LPK.DLL
ModLoad: 74d90000 74dfb000 C:\WINDOWS\system32\USP10.dll
ModLoad: 773d0000 774d3000 C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_6595b64144ccf1df_6.0.2600.2982_x-ww_ac3f9c03\comctl32.dll
Breakpoint 0 hit
eax=00000000 ebx=7ffdd000 ecx=0013ffb0 edx=7c90eb94 esi=7c9118f1 edi=00011970
eip=00401c94 esp=0013ffc4 ebp=0013fff0 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
nlnotes+0x1c94:
00401c94 6a74 push 74h
0:000> u 401c94
*** WARNING: Unable to verify checksum for nlnotes.exe
*** ERROR: Module load completed but symbols could not be loaded for nlnotes.exe
nlnotes+0x1c94:
00401c94 6a74 push 74h
00401c96 68c0214000 push offset nlnotes+0x21c0 (004021c0)
00401c9b e8e8020000 call nlnotes+0x1f88 (00401f88)
00401ca0 33db xor ebx,ebx
00401ca2 895de0 mov dword ptr [ebp-20h],ebx
00401ca5 53 push ebx
00401ca6 8b3d24204000 mov edi,dword ptr [nlnotes+0x2024 (00402024)]
00401cac ffd7 call edi
Here is our sample Lotus Notes function call to be identified:
00401c9b e8e8020000 call nlnotes+0x1f88 (00401f88)
In a Lotus Notes 7 client installation, which usually resides in the C:\Lotus\Notes directory, there is a file called LotusNotes.sym, in a compressed format. The first thing to do is to decompress this file using MAP2ISYM.EXE:
For the curious, MAP2ISYM.EXE can be obtained from the official Lotus Notes sites.
Next, we have to determine the start and end offsets of the NLNOTES.EXE module, so that we can avoid misidentifying function names.
This is the portion of the raw LotusNotes.sym data, shown in a hex editor, that determines the start and end offsets of the NLNOTES.EXE module:
Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F
00000650 04 43 00 00 40 00 37 A8 B7 00 07 00 00 00 4E 4C C @ 7¨· NL
00000660 4E 4F 54 45 53 7F 5D 00 43 00 00 C5 62 F4 B8 B7 NOTES] C Åbô¸·
00000670 00 08 00 00 00 4E 4C 53 43 43 53 54 52 59 61 00 NLSCCSTRYa
00000680 43 C
The hex values marked in red and blue denote the start and end offsets of the NLNOTES.EXE module, B7A837 and B7B8F4 respectively. So the relative addresses for NLNOTES.EXE should reside in this range.
For the call above, the relative address is 1F88, because the base address has already been determined to be 400000.
This relative address is zero-extended to 00001F88 and transposed into little-endian byte order, 88 1F 00 00. This value can then be searched for with a hex editor:
Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F
00B7AB40 84 1F 00 00 0A 00 5F 5F 73 65 74 61 72 67 76 00 „ __setargv
00B7AB50 88 0F 00 00 88 1F 00 00 0D 00 5F 5F 53 45 48 5F ˆ ˆ __SEH_
00B7AB60 70 72 6F 6C 6F 67 00 C3 0F 00 00 C3 1F 00 00 0D prolog à Ã
The red value marks the mapping from the relative address to the function name, and the location around 00B7AB50 is still well within the defined start and end boundaries of the NLNOTES.EXE module. So we can safely deduce that the function name is _SEH_prolog:
00401c94 6a74 push 74h
00401c96 68c0214000 push offset nlnotes+0x21c0 (004021c0)
00401c9b e8e8020000 call nlnotes+0x1f88 (00401f88) ;;_SEH_prolog
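This manual hex-editor search can also be scripted. Here is a minimal Python sketch; note that the record layout it assumes (a 4-byte little-endian relative address, a 2-byte name length, then the name) is inferred from the hex dump above, not from any documented .sym format specification:

import struct

def find_name_for_rva(sym_data, rva, lo, hi):
    # search the decompressed LotusNotes.sym bytes for the little-endian
    # form of the relative address, restricted to the NLNOTES.EXE range
    # (lo=0xB7A837, hi=0xB7B8F4 in this post)
    pos = sym_data.find(struct.pack("<I", rva), lo, hi)
    if pos < 0:
        return None
    name_len = struct.unpack_from("<H", sym_data, pos + 4)[0]
    raw = sym_data[pos + 6 : pos + 6 + name_len]
    return raw.rstrip(b"\x00").decode("ascii", "replace")

# with the decompressed sym file read into `data`:
# print(find_name_for_rva(data, 0x1F88, 0xB7A837, 0xB7B8F4))  # __SEH_prolog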
Using this manual method (or a script like the sketch above), we can theoretically determine any function name of interest. And that concludes the answer to whether reverse engineering a Lotus Notes application, in particular determining function names, is possible.
As a side comment: the function identified in this sample case (_SEH_prolog), although it resides in the NLNOTES.EXE module, is still part of the Windows/C runtime. This executable was probably built with Microsoft's Visual C++.
Happy identifying :)
Thursday, June 5, 2008
Unraveling MS SQL 2000 Database Format (Part 1)
This time I would like to discuss the physical format of Microsoft's SQL 2000 database. As you probably already know, each MS SQL 2000 database corresponds to one or more files, with the default extension .MDF.
The raw data of an MDF file is logically divided into blocks of 8192 bytes, each called a page. In each block, the first 96 bytes are the header and the rest is the data, depending on the page type defined in the page header.
Now, let's look at the first block, which corresponds to page 0 (zero), of a sample physical SQL 2000 database, using a hex editor:
Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 01 0F 00 00 08 01 00 00 00 00 00 00 00 00 00 00
00000010 00 00 00 00 00 00 01 00 63 00 00 00 01 1F DE 18 c Þ
00000020 00 00 00 00 01 00 00 00 67 00 00 00 78 01 00 00 g x
00000030 13 00 00 00 00 00 00 00 00 00 00 00 42 D2 00 00 BÒ
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Please be reminded that I will not explain each variable further (for instance, I will not explain what m_type means, etc.); you can consult other documentation regarding these properties.
Offset 0, hex 01, is headerVersion
Offset 1, hex 0F, is the page type (m_type)
Offset 2, hex 00, is m_typeFlagBits
Offset 3, hex 00, is m_level
Offset 4, hex 08, is m_flagBits
Offset 5, hex 01, is at the time of writing not yet identified
Offsets 6-7, hex 00 00, are m_indexId
Offsets 18-1B, hex 63 00 00 00, i.e. 0x00000063 = 99 decimal, are m_objId
Offsets 20-23, hex 00 00 00 00, are the second part of m_PageId
Offsets 24-25, hex 01 00, are the first part of m_PageId, i.e. m_FileId
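A minimal Python sketch decoding just the fields identified above from a raw page (the file name is a placeholder; multi-byte fields are little-endian):

import struct

def parse_page_header(page):
    # decode only the header fields this post has mapped so far
    return {
        "headerVersion":  page[0x00],
        "m_type":         page[0x01],
        "m_typeFlagBits": page[0x02],
        "m_level":        page[0x03],
        "m_flagBits":     page[0x04],
        "m_indexId":      struct.unpack_from("<H", page, 0x06)[0],
        "m_objId":        struct.unpack_from("<I", page, 0x18)[0],
        # m_PageId is shown as file:page; the file id is the first part
        "m_PageId": (struct.unpack_from("<H", page, 0x24)[0],
                     struct.unpack_from("<I", page, 0x20)[0]),
    }

with open("sample.mdf", "rb") as f:
    print(parse_page_header(f.read(8192)))  # page 0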
If time permits, I will continue with the next deciphering task. That's all, folks!
Wednesday, June 4, 2008
Reverse Engineering the Adobe Acrobat Applications
Limiting the scope to computer technology, and especially to programming languages, I personally consider reverse engineering a means of conveying meaning to otherwise lifeless assembly code consisting of numbers in hexadecimal format.
In a broader scope, I believe that every human search for the meaning of life, every human pursuit in every branch of science, is also a reverse engineering activity: it too tries to convey the meaning of the mechanisms or inner workings of the universe.
So, per the title above, I will try to convey some meaning for the assembly code of existing Adobe plug-ins. The most basic task is to find, or identify, the functions used by an Acrobat plug-in.
According to the documentation provided by Adobe, function calling in the Acrobat framework uses HFTs (Host Function Tables). If I create a plug-in, I can also export HFTs, so that, when they are well documented, other plug-ins can use the functions in my plug-in to perform some intended operation.
To demonstrate Acrobat's HFT mechanism, let us examine the smallest prototype plug-in supplied by Adobe in the SDK, named starter.api. I compiled it and then disassembled it using DISASM.EXE by SangCho. In the original source, starter.api calls the most-used Acrobat function, ASAtomFromString. In compiled form, it looks like this:
:1001266A 68E4410210 push 100241E4
(StringData)"ADBE:Starter"
:1001266F 8B1520850210 mov edx, dword[10028520 {gCoreHFT} ]
:10012675 FF5214 call dword[edx+14]
Clearly the edx register contains a pointer to the HFT, in this case gCoreHFT. The hexadecimal value 14, or 20 in decimal, should then correspond to the enumerated index.
The calling method uses the enumerated index of the function name plus a SEL suffix. For example, the enumerated name for ASAtomFromString is ASAtomFromStringSEL. These enumerated names are supposed to exist in the header files supplied by Adobe in the SDK.
But if you search for the string "ASAtomFromStringSEL" in any of the header files, you will find nothing. Why? Because it is constructed, rather cleverly, using definition macros. So there is no way to find the value of an enumerated name from the header files alone.
Some compilers can generate a pre-compiled header, i.e. a file with the .PCH extension. That file is where you would have to look, but, again, unfortunately, it is in a machine format that humans cannot read, and I don't know whether any program currently exists to decompile a .PCH file into readable definitions.
So the easiest method is to assign ASAtomFromStringSEL to some variable in a plug-in we control, in this case starter.api, and inspect it during a debugging session, or print it or write it to a temporary file. Using this method, we find that ASAtomFromStringSEL corresponds to the value 5. And the relation between 5 and 20 is 20/4, i.e. division by 4. Why 4? Because each HFT entry is a 4-byte function pointer, so the selector is scaled by 4 to form the byte displacement into the table.
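In other words, the displacement in 'call dword[edx+14]' is just the selector scaled by the entry size. A trivial Python sketch of that relation:

def hft_displacement(selector, entry_size=4):
    # an HFT is an array of 4-byte function pointers, so the byte
    # displacement into the table is selector * entry_size
    return selector * entry_size

ASAtomFromStringSEL = 5  # found empirically, as described above
print(hex(hft_displacement(ASAtomFromStringSEL)))  # -> 0x14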
So, provided we have all the enumerated values for each HFT category, such as Core, AcroSupport, etc., we can theoretically identify any HFT call in a plug-in, and thereby serve the purpose of conveying meaning to the lifeless assembly code in Adobe plug-ins: reverse engineering.
Friday, February 1, 2008
How to Do Recovery on an MS SQL Server Database?
One day, the server that hosts our MS SQL instance crashed. Maybe it was caused by a worm infecting the Windows system, maybe by a failure of the physical storage media, or maybe something else entirely.
Upon restart, about 40% of the databases in the MS SQL Server instance (we are using SQL 2000) turned gray, i.e. suspect. So I chose one of the databases to examine.
After many failed attempts to access the data in one of the grayed databases, I tried detaching it. When I tried to attach it again, I got this message:
So the reason this database is grayed out in SQL Enterprise Manager (EM) is this problem. Next, I try the same thing using the stored procedure sp_attach_db in Query Analyzer:
sp_attach_db 'edoc_test3', 'c:\mydata\edoc_data.mdf'
It gives the following error message :
Server: Msg 823, Level 24, State 2, Line 1
I/O error (torn page) detected during read at offset 0x00000000012000 in file 'c:\mydata\edoc_data.mdf'.
Connection Broken
It says there is a torn page at offset 12000 hex in this file. I won't elaborate on this "torn page" thing that has already torn apart many lives depending on MS SQL Server, other than to say that SQL refuses to accept the defective database because some small part of it is damaged.
The grayed (suspect) status and this problem mean the same thing: I can't run DBCC CHECKDB or DBCC PAGE either on the grayed (suspect) database or when the database is not attached.
In the above case, the error message given by Query Analyzer is more specific than the one from Enterprise Manager.
The idea is to find out which page is damaged and replace it with a clean one. This technique passes the preliminary checking of the database during the attach process, so that the database can at least be attached and its status icon turns yellow. Yellow means we can run DBCC CHECKDB and DBCC PAGE.
Recall from the MS SQL documentation of the internal storage structure that the physical file is organized in blocks of 8192 bytes. These blocks are called pages and are numbered from 0 to n.
We also have to derive the exact location in the file where the defective page resides. From the information above, offset 12000 hex = 73728 decimal, and 73728 / 8192 = page 9.
Using a raw hexadecimal editor such as WinHex, we can perform 'surgery' on this page, provided we have a backup containing a good copy. The backup has to be as close in time to the damaged database as possible: the closer, the better.
From the good copy, we go to the same offset, copy the 8192-byte block, and patch it into the same offset of the damaged file. This is performed with a hex utility such as WinHex.
Repeat these steps, patching each damaged page with the good one from the backup, until the database can be attached successfully. Then run DBCC CHECKDB and DBCC PAGE to examine and extract the damaged data. A scripted sketch of the transplant is shown below.
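The WinHex steps can also be scripted. Here is a minimal Python sketch of the page transplant (the backup file name is a placeholder; always work on copies, never the originals):

PAGE_SIZE = 8192

def transplant_page(damaged_path, backup_path, page_no):
    # copy one 8192-byte page from the known-good backup MDF into the
    # damaged MDF at the same offset -- the 'surgery' described above
    offset = page_no * PAGE_SIZE
    with open(backup_path, "rb") as src:
        src.seek(offset)
        good_page = src.read(PAGE_SIZE)
    with open(damaged_path, "r+b") as dst:
        dst.seek(offset)
        dst.write(good_page)

# the torn page at offset 0x12000 is page 0x12000 // 8192 = 9:
# transplant_page("edoc_data.mdf", "edoc_data_backup.mdf", 9)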
One point to remember: compare the good and damaged blocks before patching.
Happy recoverying :)