Wednesday, August 11, 2010

Debugging a Bugcheck 0x109

I want to share with you a recent experience where 64-bit Windows Server 2008 servers at a customer location were encountering bugcheck 0x109 blue screen crashes.
In 64-bit versions of the Windows kernel PatchGuard is present. If any driver or application attempts to modify the kernel the PatchGuard will generate the bugcheck (CRITICAL_STRUCTURE_CORRUPTION) mentioned below. PatchGuard protects the kernel from modification by malicious or badly written drivers or software.
To further investigate this bugcheck you need to compare the impacted kernel function with a known reliable one. For instance, if the machine encountering this was running Windows Server 2008 service pack 2 with a post SP2 hotfix kernel you need to compare the impacted kernel function with that of service pack 2 kernel function. Usually you do not need to download and extract the post SP2 hotfix, because the vast majority of the kernel code has not been modified since the service pack.
If you already have service pack 2 for Windows Server 2008 handy, expand the package using instructions included in KB928636:
Windows6.0-KB948465-X64.exe /x
expand.exe -f:* C:\WS08\SP2\windows6.0-kb948465-X64.cab C:\WS08\SP2\Expanded
Locate the kernel binary from the expanded binaries and then open it up with your debugger just like you open a crash memory dump.
windbg -z C:\WS08\SP2\Expanded\amd64_microsoft-windows-os-kernel_31bf3856ad364e35_6.0.6002.18005_none_ca3a763069a24eea\ntoskrnl.exe
This is the bugcheck data from the dump:
CRITICAL_STRUCTURE_CORRUPTION (109)
This bugcheck is generated when the kernel detects that critical kernel code or
data have been corrupted. There are generally three causes for a corruption:
1) A driver has inadvertently or deliberately modified critical kernel code
or data. See http://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx
2) A developer attempted to set a normal kernel breakpoint using a kernel
debugger that was not attached when the system was booted. Normal breakpoints,
"bp", can only be set if the debugger is attached at boot time. Hardware
breakpoints, "ba", can be set at any time.
3) A hardware corruption occurred, e.g. failing RAM holding kernel code or data.
Arguments:
Arg1: a3a039d89b456543, Reserved
Arg2: b3b7465eedc23277, Reserved
Arg3: fffff80001778470, Failure type dependent information
Arg4: 0000000000000001, Type of corrupted region, can be
       0 : A generic data region
       1 : Modification of a function or .pdata
       2 : A processor IDT
       3 : A processor GDT
       4 : Type 1 process list corruption
       5 : Type 2 process list corruption
       6 : Debug routine modification
       7 : Critical MSR modification
Next, check the address at Arg3. This will give you the function that was modified, but not the offset of the modified instruction.
Read more: NTDebugging