Debugging custom filters for unhandled exceptions
Updated: 09.10.2005
Introduction
Debugging custom filters
Somebody is overwriting my filter!
Enforcing your own filter
Sample code
Conclusion
Introduction
When our application crashes on the customer's site, we need as much information about the problem as possible. System tools like Dr. Watson can help to collect the necessary information, but their effectiveness depends on the configuration of the target system. If we do not want to depend on system configuration, a custom filter for unhandled exceptions is a good solution. With the help of a custom filter, we can get notified about unhandled exceptions in the application, create a detailed crash report, and sometimes even send it automatically to the developers for investigation.
Unfortunately, custom filters for unhandled exceptions are not easy to debug. In addition, sometimes it can be difficult to ensure that our filter is properly registered, because other components of the application (including some system DLLs) might want to register their own filters. In this article, I will show how to overcome these problems and make our filters debuggable and reliable.
Debugging custom filters
So we have implemented a custom filter for unhandled exceptions (here is an example) and registered it using SetUnhandledExceptionFilter function. We simulate an unhandled exception, and our filter is invoked, but it looks like it has a bug – say, it does not properly create the minidump. We want to debug the filter, so we run the application under debugger and set a breakpoint at the beginning of the filter function. The exception is raised again, but there is a surprise – the breakpoint in our filter is not hit, and instead the debugger reports a second chance exception.
Pressing “Continue” does not help. What happened? The answer is simple – custom filters for unhandled exceptions are not called at all when the application is running under a debugger. A bit later we will determine exactly why this happens, but for now let's look for alternative ways of debugging our filter.
KB173652 offers some help, describing two possible approaches. The first approach is pretty simple, and can be used in nearly all situations where we cannot run the application under a debugger – use tracing. It means that our good old friends – OutputDebugString, TRACE, ATLTRACE and other members of this family – can help us again.
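The tracing approach can be sketched as follows. This is only an illustration, not the article's sample code: the helper FormatFilterTrace and the message layout are hypothetical, while MyCustomFilter is the filter name used elsewhere in this article. The formatting helper is portable; the filter itself is compiled only on Windows. A fixed buffer is used so that the filter does not allocate memory in a process that is already in trouble.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes a one-line trace message into a caller-supplied
// buffer and returns the value reported by snprintf.
int FormatFilterTrace(char* buffer, std::size_t size,
                      std::uint32_t code, std::uintptr_t address)
{
    assert(buffer != nullptr && size > 0);
    return std::snprintf(buffer, size,
                         "MyCustomFilter: code=0x%08lX address=0x%llX\n",
                         static_cast<unsigned long>(code),
                         static_cast<unsigned long long>(address));
}

#ifdef _WIN32
// A filter that only traces; the message can be seen in any debug output viewer.
LONG WINAPI MyCustomFilter(EXCEPTION_POINTERS* pep)
{
    const EXCEPTION_RECORD* per = pep->ExceptionRecord;
    char message[96];
    FormatFilterTrace(message, sizeof(message), per->ExceptionCode,
                      reinterpret_cast<std::uintptr_t>(per->ExceptionAddress));
    OutputDebugStringA(message);
    return EXCEPTION_EXECUTE_HANDLER;
}
#endif
```

On Windows the filter would be registered with SetUnhandledExceptionFilter( MyCustomFilter ) early in the application's startup code.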
Another approach is to modify our application (for debugging purposes only!) to execute the filter from __except clause:
    __try
    {
        FaultyFunc();
    }
    __except( MyCustomFilter( GetExceptionInformation() ) )
    {
        _tprintf( _T("Exception handled.\n") );
    }
(see complete example here)
Filters called inside __except clauses are not skipped when the application is running under debugger. When FaultyFunc raises an exception, breakpoint in our filter will be hit and we will be able to debug the filter. An obvious limitation of this approach is that it is intrusive – we have to write additional code only to debug the filter.
There is also another intrusive approach to debugging filters (much less intrusive, though) – modify the filter to show a message box (or print a console message and then sleep) when it is called. Then we can run the application (but this time not under debugger), simulate an unhandled exception, and wait until the filter gets called and displays the message box. After that we can attach a debugger to the application, set breakpoint in the filter, and dismiss the message box. Our breakpoint will be hit, and we will be able to debug the filter. An example of this approach can be found here.
But is it really necessary to write additional code only to make debugging of the filter possible? Maybe there is a non-intrusive way of debugging a custom filter? Yes, there is one, and I am going to describe it now.
To begin with, let's explore how the operating system calls custom filters for unhandled exceptions. If we look at the call stack of the first thread of any Win32 application, we can see that execution always starts at the kernel32!BaseProcessStart function. This function receives a pointer to the application's main entry point (usually mainCRTStartup or WinMainCRTStartup in C++ applications) and calls it inside a __try..__except block:
    VOID BaseProcessStart( PPROCESS_START_ROUTINE pfnStartAddr )
    {
        __try
        {
            ExitThread( (pfnStartAddr)() );
        }
        __except( UnhandledExceptionFilter( GetExceptionInformation()) )
        {
            ExitProcess( GetExceptionCode() );
        }
    }
(this code is a bit simplified; more complete code can be found in Figure 7 of this article)
Execution of all other threads in the process starts with a similar function – kernel32!BaseThreadStart:
    VOID BaseThreadStart( PTHREAD_START_ROUTINE pfnStartAddr, PVOID pParam )
    {
        __try
        {
            ExitThread( (pfnStartAddr)(pParam) );
        }
        __except( UnhandledExceptionFilter(GetExceptionInformation()) )
        {
            ExitProcess( GetExceptionCode() );
        }
    }
As we can see, if any of the application's threads raises an exception and does not handle it, the control will be passed to kernel32!UnhandledExceptionFilter function. This function offers the last chance processing for all unhandled exceptions. I will not provide the whole list of actions performed by kernel32!UnhandledExceptionFilter (it deserves an article of its own; many details can be found here), but here are the most important steps:
- Check if the exception was raised because of an attempt to write into a read-only memory page inside .rsrc section, and correct the problem by making the memory page writeable.
- If the application is running under debugger, do not handle the exception and pass control to the debugger.
- Call the registered custom filter for unhandled exceptions.
- If the filter chose not to handle the exception, launch the registered just-in-time debugger to debug the application.
Here is the pseudocode of the relevant parts of kernel32!UnhandledExceptionFilter:
    LONG UnhandledExceptionFilter( EXCEPTION_POINTERS* pep )
    {
        DWORD rv;

        EXCEPTION_RECORD* per = pep->ExceptionRecord;

        // Check for read-only resource access

        if( ( per->ExceptionCode == EXCEPTION_ACCESS_VIOLATION ) &&
            ( per->ExceptionInformation[0] != 0 ) )
        {
            rv = BasepCheckForReadOnlyResource( per->ExceptionInformation[1] );

            if( rv == EXCEPTION_CONTINUE_EXECUTION )
                return EXCEPTION_CONTINUE_EXECUTION;
        }

        // Is the process running under debugger ?

        DWORD DebugPort = 0;

        rv = NtQueryInformationProcess( GetCurrentProcess(), ProcessDebugPort,
                                        &DebugPort, sizeof( DebugPort ), 0 );

        if( ( rv >= 0 ) && ( DebugPort != 0 ) )
        {
            // Yes, it is -> Pass exception to the debugger
            return EXCEPTION_CONTINUE_SEARCH;
        }

        // Is custom filter for unhandled exceptions registered ?

        if( BasepCurrentTopLevelFilter != 0 )
        {
            // Yes, it is -> Call the custom filter

            rv = (BasepCurrentTopLevelFilter)(pep);

            if( rv == EXCEPTION_EXECUTE_HANDLER )
                return EXCEPTION_EXECUTE_HANDLER;

            if( rv == EXCEPTION_CONTINUE_EXECUTION )
                return EXCEPTION_CONTINUE_EXECUTION;
        }

        // Proceed to other tasks (check error mode, start JIT debugger, etc.)

        ...
    }
Look! The function uses NtQueryInformationProcess to determine whether the application is running under debugger, and whether it should skip calling the custom filter. If we modify the value returned by NtQueryInformationProcess (DebugPort), we can “ask” it to continue and call our filter, even when the application is running under debugger.
So the solution for our problem looks simple – run the application under debugger as usual, but instead of setting a breakpoint in our custom filter, set the breakpoint in kernel32!UnhandledExceptionFilter function. When the breakpoint is hit, we should step through the function until the call to NtQueryInformationProcess returns, and modify the value returned by NtQueryInformationProcess (in 3rd parameter) to zero – to make it look like the application is not really running under debugger.
The only problem is that we don't have the source code of kernel32.dll, so we have to step through the disassembly. Now let's take a look at the disassembly of the kernel32!UnhandledExceptionFilter function, to familiarize ourselves with its layout and see where we should go and what we should modify.
In Windows 2000 SP4, the relevant parts of the disassembly look like this:
         // Setup the stack frame
         7c51bd2b push ebp
         7c51bd2c mov ebp,esp
         7c51bd2e push 0xff
         7c51bd30 push 0x7c505788
         7c51bd35 push 0x7c4ff0b4
         7c51bd3a mov eax,fs:[00000000]
         7c51bd40 push eax
         7c51bd41 mov fs:[00000000],esp
         7c51bd48 push ecx
         7c51bd49 push ecx
         7c51bd4a sub esp,0x2d8
         7c51bd50 push ebx
         7c51bd51 push esi
         7c51bd52 push edi
         7c51bd53 mov [ebp-0x18],esp

         // Call BasepCheckForReadOnlyResource, if necessary
         7c51bd56 mov esi,[ebp+0x8]
         7c51bd59 mov eax,[esi]
         7c51bd5b cmp dword ptr [eax],0xc0000005
         7c51bd61 jnz KERNEL32!UnhandledExceptionFilter+0x53 (7c51bd8a)
         7c51bd63 xor ebx,ebx
         7c51bd65 cmp [eax+0x14],ebx
         7c51bd68 jz KERNEL32!UnhandledExceptionFilter+0x55 (7c51bd8c)
         7c51bd6a push dword ptr [eax+0x18]
         7c51bd6d call KERNEL32!BasepCheckForReadOnlyResource (7c51bc52)
         7c51bd72 cmp eax,0xffffffff
         7c51bd75 jnz KERNEL32!UnhandledExceptionFilter+0x55 (7c51bd8c)
         7c51bd77 or eax,eax
         7c51bd79 mov ecx,[ebp-0x10]
         7c51bd7c mov fs:[00000000],ecx
         7c51bd83 pop edi
         7c51bd84 pop esi
         7c51bd85 pop ebx
         7c51bd86 leave
         7c51bd87 ret 0x4

         // Call NtQueryInformationProcess to determine whether the process is being debugged
         7c51bd8a xor ebx,ebx
         7c51bd8c mov [ebp-0x38],ebx
         7c51bd8f push ebx
         7c51bd90 push 0x4
         7c51bd92 lea eax,[ebp-0x38]
         7c51bd95 push eax
         7c51bd96 push 0x7
         7c51bd98 call KERNEL32!GetCurrentProcess (7c4fe0c8)
         7c51bd9d push eax
         7c51bd9e call dword ptr [KERNEL32!_imp__NtQueryInformationProcess (7c4e10b8)]
         7c51bda4 cmp eax,ebx
         7c51bda6 jl KERNEL32!UnhandledExceptionFilter+0x7a (7c51bdb1)

         // Check the value returned by NtQueryInformationProcess (DebugPort)
         //
         // If you want to call the custom filter even when the application is running
         // under debugger, set [ebp-0x38] to zero
         //
    1--> 7c51bda8 cmp [ebp-0x38],ebx
         7c51bdab jne KERNEL32!UnhandledExceptionFilter+0x2c3 (7c51bfda)

         // Call the custom filter
         7c51bdb1 mov eax,[KERNEL32!BasepCurrentTopLevelFilter (7c54144c)]
         7c51bdb6 cmp eax,ebx
         7c51bdb8 jz KERNEL32!UnhandledExceptionFilter+0x98 (7c51bdc7)
         7c51bdba push esi
    2--> 7c51bdbb call eax
         7c51bdbd cmp eax,0x1
         7c51bdc0 jz KERNEL32!UnhandledExceptionFilter+0x2e1 (7c51bd79)
         7c51bdc2 cmp eax,0xffffffff
         7c51bdc5 jz KERNEL32!UnhandledExceptionFilter+0x2e1 (7c51bd79)
We should set our breakpoint at the beginning of the function. Then we should step through the function's disassembly until we reach the line marked with “1-->”, and set to zero the value stored at EBP-0x38 (in VS debugger, you can enter “EBP-0x38, x” into Watch window to obtain the address). After we have set the value, we should step again until the line marked with “2-->”, where our filter is about to be called. Press F11, and we are inside our filter! (Of course, nobody prevents us from optimizing this process with a couple of additional breakpoints)
What about Windows XP and Windows Server 2003? The disassembly of kernel32!UnhandledExceptionFilter function looks very similar to the one on Windows 2000. Recent service packs introduce some additional steps to the logic of this function (we will talk about some of them later in this article), but our approach remains the same – modify the value returned by NtQueryInformationProcess, then step until we reach the call to our filter, and step into it.
Somebody is overwriting my filter!
Now we know how to debug our filter. But sometimes we can end up in a situation when our filter is not called at all. Why could it happen? Of course, the first suspect is the call to SetUnhandledExceptionFilter function – if it is not called properly, our filter is not registered. But it is difficult to miss this call, right?
After we have verified that our application calls SetUnhandledExceptionFilter properly, we start suspecting that somebody else overwrites our filter. Yes, there can be only one registered filter for the whole process, so any DLL or, for example, a 3rd party control could call SetUnhandledExceptionFilter and register its own filter, disabling ours.
Is there a way to determine whether our filter is registered or not? Yes, there is. The pointer to the currently registered filter is stored in BasepCurrentTopLevelFilter global variable, which is located in kernel32.dll. If we have symbols for kernel32.dll (and symbol server can help us to obtain them), we can use the debugger to see which filter is registered. In Visual Studio debugger, we can use the following expression in Watch window to see the address of the currently registered custom filter:
*(unsigned long*){,,kernel32.dll}_BasepCurrentTopLevelFilter, x
In WinDbg, 'dds' command is very convenient:
    > dds kernel32!BasepCurrentTopLevelFilter L1
    7c54144c  00401050 MyApp!MyCustomFilter [c:\test\myapp.cpp @ 63]
(note that if incremental linking is enabled, BasepCurrentTopLevelFilter can point to a thunk, but even in that case it is easy to identify the real filter)
Unfortunately, this method does not work on Windows XP SP2, Windows Server 2003 SP1 and later, because the pointer to the filter function is stored there in encoded form. If we look at the disassembly of the SetUnhandledExceptionFilter function in Windows XP SP2, we can see that the pointer is passed to the EncodePointer function, and the resulting encoded value is stored in the BasepCurrentTopLevelFilter variable. Fortunately, after exploring the disassembly of the ntdll!RtlEncodePointer function (kernel32!EncodePointer is forwarded to ntdll!RtlEncodePointer), it becomes clear that the encoding simply XORs the pointer value with a process-wide cookie. Further examination reveals that the cookie is stored as part of the _EPROCESS structure in kernel memory (_EPROCESS.Cookie), and therefore we can use our favourite kernel debugger (I like WinDbg with LiveKd) to display this value. For example:
    > !process <pid> 0
    > dt nt!_EPROCESS <address> Cookie
(<address> is the address of _EPROCESS structure, obtained from the output of !process command)
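Once the cookie is known, decoding the stored value is a single XOR. The following is a minimal sketch, assuming the plain XOR scheme observed above in Windows XP SP2 and 32-bit pointers; the function name XorWithCookie and the cookie value are made up for illustration, and other Windows versions may encode pointers differently.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the encoding is a plain XOR with the process-wide cookie, so the
// same operation both encodes and decodes a 32-bit pointer value.
constexpr std::uint32_t XorWithCookie(std::uint32_t value, std::uint32_t cookie)
{
    return value ^ cookie;
}

// Hypothetical values: a cookie as it might be read from _EPROCESS.Cookie, and
// the filter address used in the 'dds' example earlier in this article.
constexpr std::uint32_t kExampleCookie = 0x5A3C91E7u;
constexpr std::uint32_t kFilterAddress = 0x00401050u;

// What BasepCurrentTopLevelFilter would contain under this assumption.
constexpr std::uint32_t kStoredValue = XorWithCookie(kFilterAddress, kExampleCookie);

static_assert(kStoredValue == 0x5A7C81B7u, "encoded form of the filter address");
static_assert(XorWithCookie(kStoredValue, kExampleCookie) == kFilterAddress,
              "XORing with the cookie again restores the original pointer");
```

In practice we would read the stored value from BasepCurrentTopLevelFilter in the user mode debugger, read the cookie in the kernel debugger, and XOR the two by hand or with the debugger's expression evaluator.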
If we don't want to resort to a kernel debugger to decode the pointer, we can simply run the application under a debugger and set a breakpoint at SetUnhandledExceptionFilter to see who is registering custom filters for unhandled exceptions (of course, a data breakpoint at kernel32!BasepCurrentTopLevelFilter could be an even more reliable source of information, in case somebody attempts to modify the variable directly).
Here is the output shown by WinDbg when debugging this sample application, after I set a breakpoint at SetUnhandledExceptionFilter using the following command:
    0:000> bp kernel32!SetUnhandledExceptionFilter "k;g"
    0:000> g

    ChildEBP RetAddr
    0012f820 780020fd KERNEL32!SetUnhandledExceptionFilter
    0012f828 780011b3 msvcrt!__CxxSetUnhandledExceptionFilter+0xb
    0012f830 78001e29 msvcrt!_initterm+0xf
    0012f83c 780010ec msvcrt!_cinit+0x1a
    0012f8dc 77f86215 msvcrt!_CRTDLL_INIT+0xec
    0012f8fc 77f86f17 ntdll!LdrpCallInitRoutine+0x14
    0012f97c 77f8b845 ntdll!LdrpRunInitializeRoutines+0x1df
    0012fc98 77f8c295 ntdll!LdrpInitializeProcess+0x802
    0012fd1c 77fa15d3 ntdll!LdrpInitialize+0x207
    00000000 00000000 ntdll!KiUserApcDispatcher+0x7

    ChildEBP RetAddr
    0012feb8 0041563e KERNEL32!SetUnhandledExceptionFilter
    0012fec4 00415946 MyApp!__CxxSetUnhandledExceptionFilter+0xe
    0012fed0 00415699 MyApp!_initterm_e+0x26
    0012fee4 00412343 MyApp!_cinit+0x29
    0012ffc0 7c4e87f5 MyApp!mainCRTStartup+0x133
    0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d

    ChildEBP RetAddr
    0012fe04 00411b7b KERNEL32!SetUnhandledExceptionFilter
    0012fedc 00412380 MyApp!main+0x2b
    0012ffc0 7c4e87f5 MyApp!mainCRTStartup+0x170
    0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d
The output looks interesting – we can see no fewer than three attempts to register a custom filter, and two of them originate from ... the CRT library! This brings us to the next part of our discussion – we will try to determine which components of our application might want to install a custom filter. For now, I can identify three categories:
- CRT library
- .NET runtime
- 3rd party components
Let's approach them one by one, and start with the CRT library. It turns out that the CRT library relies on a custom filter for unhandled exceptions to implement support for terminate() and related functionality (remember that C++ exceptions are implemented with the help of SEH exceptions). Whenever an unhandled C++ exception is thrown by the application, the custom filter installed by the CRT library will catch it and call terminate(). The application can use the set_terminate() function to register a terminate handler and get notified about unhandled C++ exceptions. If there is no registered handler, or if the handler returns control, the application is terminated by abort().
Fortunately, the custom filter installed by CRT library is a good citizen (by the way, its name is _CxxUnhandledExceptionFilter), and it does not attempt to handle all possible types of exceptions. If it catches an exception other than “Microsoft C++ Exception” (its code is 0xE06D7363), it calls the previously registered filter, and lets it process the exception.
Filter chaining
When we register a custom filter using SetUnhandledExceptionFilter, the function returns a pointer to the previously registered filter. This pointer gives us the option to call the previous filter from ours, if necessary. Why would we want to do it? For example, if our filter only cares about some specific kinds of exceptions, and does not want to take responsibility of others. A good example of this is the filter installed by CRT library, which handles Microsoft C++ exception (exception code 0xE06D7363) and passes all other exceptions to the previously registered filter. Also, if we want to unregister our filter, we can call SetUnhandledExceptionFilter again and pass it the pointer to the previous filter, thus re-registering it.
In theory, this feature allows an application and its components to implement a chain of custom filters for unhandled exceptions. As long as every filter in the chain plays by the rules and calls the previous filter, every interested component can be notified about unhandled exceptions and react properly. You can find an example of filter chaining here.
In practice, as usual, there are some difficulties. After we have got a pointer to the previous filter, it is not always possible to ensure that the filter can be called safely. What if the filter was registered by a DLL, which is now unloaded? If it is still loaded, is it ready to process the exception? If we don't have the intimate knowledge and control over lifetimes of the application's components, it is probably better not to rely on filter chains, and have only one filter which is responsible for application-wide error handling policy.
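The chaining rules described above can be illustrated with a small portable model. It does not call the real Win32 API: SetFilter, Dispatch, AppFilter and CrtLikeFilter are hypothetical stand-ins for SetUnhandledExceptionFilter, kernel32!UnhandledExceptionFilter, the application's filter and the CRT filter, and a filter here receives only the exception code instead of EXCEPTION_POINTERS.

```cpp
#include <cassert>
#include <cstdint>

// Values mirror EXCEPTION_CONTINUE_SEARCH and EXCEPTION_EXECUTE_HANDLER.
constexpr long kContinueSearch = 0;
constexpr long kExecuteHandler = 1;
constexpr std::uint32_t kCppException    = 0xE06D7363u;  // Microsoft C++ exception
constexpr std::uint32_t kAccessViolation = 0xC0000005u;

using Filter = long (*)(std::uint32_t code);

Filter g_topLevelFilter = nullptr;   // the single process-wide slot
Filter g_crtPrevious    = nullptr;   // previous filter remembered by the CRT-like filter
int    g_appFilterCalls = 0;

// Installs a new filter and returns the previously registered one.
Filter SetFilter(Filter f)
{
    Filter previous = g_topLevelFilter;
    g_topLevelFilter = f;
    return previous;
}

// The application's filter: handles everything it is given.
long AppFilter(std::uint32_t) { ++g_appFilterCalls; return kExecuteHandler; }

// Handles only C++ exceptions and forwards the rest down the chain.
long CrtLikeFilter(std::uint32_t code)
{
    if (code == kCppException)
        return kExecuteHandler;      // the real CRT filter would call terminate()
    return g_crtPrevious ? g_crtPrevious(code) : kContinueSearch;
}

// Calls whichever filter is currently registered.
long Dispatch(std::uint32_t code)
{
    return g_topLevelFilter ? g_topLevelFilter(code) : kContinueSearch;
}

// Usage: the application registers first, the CRT-like filter registers later.
bool ChainDemo()
{
    SetFilter(AppFilter);
    g_crtPrevious = SetFilter(CrtLikeFilter);
    bool forwarded   = Dispatch(kAccessViolation) == kExecuteHandler && g_appFilterCalls == 1;
    bool handledHere = Dispatch(kCppException) == kExecuteHandler && g_appFilterCalls == 1;
    return forwarded && handledHere;
}
```

The model also shows where the chain is fragile: if the component that owns the function behind g_crtPrevious were unloaded, the forwarding call would jump to an invalid address.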
Another interesting issue with filters installed by CRT library is that there can be several instances of CRT inside our process. This is because one CRT library is linked with our main executable, and other instances of CRT can be linked with DLLs loaded by our application. Each of them will try to install its own filter for unhandled exceptions, potentially overwriting our filter.
This situation gets especially tricky in the following scenario:
1. Our application registers its custom filter at startup.
2. After a while, the application loads a DLL linked with its own version of CRT, which registers its own filter.
It means that even if we registered our filter successfully, it can be overridden any time the application loads a DLL.
.NET runtime also uses a custom filter (mscorwks!ComUnhandledExceptionFilter). I don't know .NET as well as Win32, but it is clear that at least the following aspects of .NET functionality are implemented with the help of this filter:
- Managed debugger support (the filter notifies the debugger about unhandled managed exceptions)
- AppDomain.UnhandledException event
- Just-in-time debugging (managed)
And finally, third party components can also install their own filters for unhandled exceptions. Is it good or bad? As always, it depends on the situation. In general, I think that it is not a desirable behavior, because usually the application itself, and not one of its components, should define the policy for error handling and reporting.
Enforcing your own filter
I will reiterate my last sentence: in most cases the application, and not one of its components, should be responsible for the application-wide error handling and reporting policy. In particular, it means that if the application registers a filter for unhandled exceptions, nobody else should override it by registering their own filter. How can we achieve it? I don't know a 100% reliable solution, but there are some good options. Let's discuss them.
It's obvious that if we want to make sure that our filter is always registered, we should prevent other parties from registering their filters. API hooking looks like a possible solution – we can hook SetUnhandledExceptionFilter function and reject all attempts to use it after we have registered our filter.
Nowadays, there are two main approaches to API hooking:
- Import Address Table (IAT) hooking
- Detours-like hooking
IAT approach has a problem which makes it unreliable – if the caller obtained the address of the target function (SetUnhandledExceptionFilter in our case) using GetProcAddress function, the call will not go through IAT, and our hook function will not be called.
Detours-like hooking relies on patching the beginning of the target function itself, and therefore it is more reliable – it will not miss calls that can be missed by IAT hooks. The problem with Detours is that it is not freely available (it needs a license for commercial use), and implementing a similar but home-grown solution just for hooking SetUnhandledExceptionFilter is not always feasible.
There is one more approach possible, and it is much easier to implement than the previous two. After we have registered our own filter, we can patch the beginning of SetUnhandledExceptionFilter function so that it will not be able to register filters anymore. For example, the following sequence of assembly instructions can be used:
    xor eax, eax
    ret 0x4
EnforceFilter example demonstrates this approach.
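For readers who want to see the idea in code, here is a minimal sketch of such a patch. It is not the EnforceFilter sample itself: the names kPatch, WritePatch and DisableSetUnhandledExceptionFilter are hypothetical, and the byte sequence assumes 32-bit x86, where SetUnhandledExceptionFilter is a __stdcall function with one 4-byte argument. The Windows-specific part is compiled only for that target.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#ifdef _WIN32
#include <windows.h>
#endif

// Machine code for "xor eax, eax" (33 C0) followed by "ret 0x4" (C2 04 00).
constexpr std::uint8_t kPatch[] = { 0x33, 0xC0, 0xC2, 0x04, 0x00 };

// Copies the patch over the first bytes at 'target'.
// The caller must make sure the memory is writable.
void WritePatch(void* target)
{
    assert(target != nullptr);
    std::memcpy(target, kPatch, sizeof(kPatch));
}

#if defined(_WIN32) && defined(_M_IX86)
// Call this only after our own filter has been registered.
bool DisableSetUnhandledExceptionFilter()
{
    HMODULE kernel32 = GetModuleHandleA("kernel32.dll");
    if (kernel32 == NULL)
        return false;

    void* target = reinterpret_cast<void*>(
        GetProcAddress(kernel32, "SetUnhandledExceptionFilter"));
    if (target == NULL)
        return false;

    // Code pages are normally read-only, so make them writable for the patch.
    DWORD oldProtect = 0;
    if (!VirtualProtect(target, sizeof(kPatch), PAGE_EXECUTE_READWRITE, &oldProtect))
        return false;

    WritePatch(target);

    // Restore the original protection and discard stale cached instructions.
    VirtualProtect(target, sizeof(kPatch), oldProtect, &oldProtect);
    FlushInstructionCache(GetCurrentProcess(), target, sizeof(kPatch));
    return true;
}
#endif
```

After the patch, every call to SetUnhandledExceptionFilter returns NULL and leaves the registered filter untouched. A caller that later tries to restore the "previous" filter it received will therefore pass NULL, which is harmless because that call is ignored as well.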
Sample code
Complete sample code for the article can be found here.
Conclusion
Custom filters for unhandled exceptions are difficult to debug because they are not called when the application is running under a debugger. But if we know how the operating system processes unhandled exceptions, we can change the default behavior and make our filter debuggable.
If our application consists of a large set of components, it can be difficult to implement a reliable error handling policy, because multiple components may want to be notified about unhandled exceptions. But if we know how custom filters for unhandled exceptions are registered, we can ensure that our filter always prevails.
Contact
Have questions or comments? Feel free to contact Oleg Starodumov at firstname@debuginfo.com.