Recently, while testing our COM+ application it started crashing with c0000005 – Access violation. This is VB6 COM+ called from C# application. The funny thing is, dllhost.exe crashes due to the error but there is no indication which method call crashed it.
Since it is multi-threaded application C# logs loads of RPC Unavailable errors once the COM+ crashes, but nowhere I could find original point where the crash occurred. Actually the thread dies and even if debugged in VS the debugger doesn’t break on this unhandled exception as one would expect, even though it breaks on all other – RPC Unavailable exceptions.
So, the first task in this case, for me at least, would be to find what method is called when the crash occurred. When facing something unexpected like this I resort to WinDbg which I know just enough to get by analysing crash dumps. Necessity on couple of other occasions made me learn the basics so here are some notes.
First we need the crash dump. That’s easily obtained through DebugDiag tool. Instructions on how to get the dump from Microsoft is a bit on the dry side, I prefer instructions with some pictures in it. In this case I needed to select a specific MTS/COM+ application and choose mine from the list. In the list of exceptions I chose C0000005 to generate Full Userdump. Analysing crash dump through DebugDiag tool is normally useful, but in this case did not offer any useful insight.
After exception happens user dump is created and from there on we can use WinDbg. That’s a part of Windows SDK and I selected to install it as Standalone Component. Follow this intro for getting started with WinDbg. Here’s the list of useful commands of which stack commands are particularly useful.
Oh, BTW, you’d better make sure to have debug symbols available for your COM+. In case of VB6 private symbols are rubbish, but at least for exported symbols I got names of methods, otherwise it would’ve been impossible to resolve this.
Lets see what we have in our crash dump:
This dump file has an exception of interest stored in it. The stored exception information can be accessed via .ecxr. (2620.1b6c): Access violation - code c0000005 (first/second chance not available) eax=03d5ba94 ebx=72a1a274 ecx=75ebb100 edx=00a3f140 esi=72a1a008 edi=029df748 eip=75ebb13a esp=00a3f06c ebp=00a3f174 iopl=0 nv up ei pl zr na pe nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246 oleaut32!_IDispatchProxyVtbl+0x26: 75ebb13a 336c0000 xor ebp,dword ptr [eax+eax] ds:0023:07ab7528=????????
So _IDispatchProxyVtbl is crashing. Errrr, looks like vtable – we’re trying to call some method on COM+ … well, I know that OK, lets try analysing this exception:
0:013> !analyze -v ... FAULTING_IP: oleaut32!_IDispatchProxyVtbl+26 75ebb13a 336c0000 xor ebp,dword ptr [eax+eax] EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff) ExceptionAddress: 75ebb13a (oleaut32!_IDispatchProxyVtbl+0x00000026) ExceptionCode: c0000005 (Access violation) ExceptionFlags: 00000000 NumberParameters: 2 Parameter: 00000000 Parameter: 07ab7528 Attempt to read from address 07ab7528 DEFAULT_BUCKET_ID: INVALID_POINTER_READ PROCESS_NAME: dllhost.exe ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s. EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s. EXCEPTION_PARAMETER1: 00000000 EXCEPTION_PARAMETER2: 07ab7528 READ_ADDRESS: 07ab7528 FOLLOWUP_IP: oleaut32!_IDispatchProxyVtbl+26 75ebb13a 336c0000 xor ebp,dword ptr [eax+eax] MOD_LIST: <ANALYSIS/> NTGLOBALFLAG: 0 APPLICATION_VERIFIER_FLAGS: 0 LAST_CONTROL_TRANSFER: from 75d904e8 to 75ebb13a FAULTING_THREAD: 00001b6c PRIMARY_PROBLEM_CLASS: INVALID_POINTER_READ BUGCHECK_STR: APPLICATION_FAULT_INVALID_POINTER_READ STACK_TEXT: 00a3f174 75d904e8 029df170 00216750 00000202 oleaut32!_IDispatchProxyVtbl+0x26 00a3f190 75df5311 110082bb 00a3f380 00000002 rpcrt4!Invoke+0x2a 00a3f598 759daec1 02a7bfb8 002110a8 00178b88 rpcrt4!NdrStubCall2+0x2d6 00a3f5e0 75ebffd3 02a7bfb8 00178b88 002110a8 ole32!CStdStubBuffer_Invoke+0x3c 00a3f604 759dd876 03d5ba00 00178b88 002110a8 oleaut32!CUnivStubWrapper::Invoke+0xcb 00a3f64c 759dddd0 00178b88 0282dde8 03466750 ole32!SyncStubInvoke+0x3c 00a3f698 758f8a43 00178b88 0019ec48 03d5ba00 ole32!StubInvoke+0xb9 00a3f774 758f8938 002110a8 00000000 03d5ba00 ole32!CCtxComChnl::ContextInvoke+0xfa 00a3f790 758f950a 00178b88 00000001 03d5ba00 ole32!MTAInvoke+0x1a 00a3f7bc 759ddccd 00178b88 00000001 03d5ba00 ole32!STAInvoke+0x46 00a3f7f0 759ddb41 d0908070 002110a8 03d5ba00 ole32!AppInvoke+0xab 00a3f8d0 759de1fd 00178b30 028bc878 00000400 ole32!ComInvokeWithLockAndIPID+0x372 00a3f8f8 758f9367 00178b30 00000400 001af3f8 ole32!ComInvoke+0xc5 00a3f90c 758f9326 00178b30 00a3f9cc 00000400 ole32!ThreadDispatch+0x23 00a3f950 75a1c4e7 00230b56 00000400 0000babe ole32!ThreadWndProc+0x161 ....
Not really useful. I’ve analysed exceptions before and this command’s output was much more useful on those occasions.
So lets see the stack (knf shows the stack with frame number and size of the frame):
0:013> knf # Memory ChildEBP RetAddr 00 00a3f174 75d904e8 oleaut32!_IDispatchProxyVtbl+0x26 01 1c 00a3f190 75df5311 rpcrt4!Invoke+0x2a 02 408 00a3f598 759daec1 rpcrt4!NdrStubCall2+0x2d6 03 48 00a3f5e0 75ebffd3 ole32!CStdStubBuffer_Invoke+0x3c 04 24 00a3f604 759dd876 oleaut32!CUnivStubWrapper::Invoke+0xcb 05 48 00a3f64c 759dddd0 ole32!SyncStubInvoke+0x3c 06 4c 00a3f698 758f8a43 ole32!StubInvoke+0xb9 07 dc 00a3f774 758f8938 ole32!CCtxComChnl::ContextInvoke+0xfa 08 1c 00a3f790 758f950a ole32!MTAInvoke+0x1a 09 2c 00a3f7bc 759ddccd ole32!STAInvoke+0x46 0a 34 00a3f7f0 759ddb41 ole32!AppInvoke+0xab ...
None of these are my Dlls, these are all COM+ infrastructure calls. The size of that rpcrt4!NdrStubCall2 is big enough to rouse suspicion. rpcrt4.dll is part of RPC Architecture and NdrStubCall2 is server-side entry point for RPC Stubs.
We’re nearly there, lets dump the raw stack:
0:013> kd 00a3f174 00a3f190 00a3f178 75d904e8 rpcrt4!Invoke+0x2a 00a3f17c 029df170 00a3f180 00216750 00a3f184 00000202 00a3f188 00000002 00a3f18c 00a3f380 00a3f190 00a3f598 00a3f194 75df5311 rpcrt4!NdrStubCall2+0x2d6 00a3f198 110082bb MyDll!__vba+0x71 00a3f19c 00a3f380 00a3f1a0 00000002 00a3f1a4 e28df787 00a3f1a8 02a7bfb8 00a3f1ac 00178b88 00a3f1b0 00000000 00a3f1b4 00178b88 ......
So the highlighted line looks like something I’m looking for, just not really informative. It seems like it might be a pointer to somewhere in vtable? Lets see:
0:013> u 110082bb MyDll!__vba+0x71: 110082bb e9c0d80000 jmp MyDll!MyObject::Method1+0x50 (11015b80) 110082c0 816c2404ffff0000 sub dword ptr [esp+4],0FFFFh 110082c8 e923df0000 jmp MyDll!MyObject::Method2+0x330 (110161f0) 110082cd 816c2404ffff0000 sub dword ptr [esp+4],0FFFFh 110082d5 e966e30000 jmp MyDll!MyObject::Method3+0x140 (11016640) 110082da 816c2404ffff0000 sub dword ptr [esp+4],0FFFFh 110082e2 e949e50000 jmp MyDll!MyObject::Method4+0x1c0 (11016830) 110082e7 816c2404ffff0000 sub dword ptr [esp+4],0FFFFh
And there it is (note, these are not real method names, __vba is real though). That is the culprit. Well, not really, there was some interoperability issue. The culprit was .NET code, I was passing COM+ objects back and forth, and done something in between to mess up the things. At any rate it was faster to rewrite the method than to chase the exact problem since the method in question is used only for testing.
However, if I hadn’t been able to figure out the name of the method I would’ve never been in the position to even consider a fix.
For proper debugging and analysis of RPC calls see Remote Procedure Call debugging. As I said before, I know enough to get by and in this case, luckily, I didn’t need to dig deeper. On a couple of other occasions I had to and will follow up in further posts.
- Get the dump
- Fire up WinDbg
- !analyze -v to analyse exception
- knf – to get the call stack
- kd – to dump the raw stack
- u address to disassemble at address