Thursday, September 22, 2011

"R6025 - pure virtual function call" Uh oh...

A while ago I stumbled upon a "purecall" crash. The actual fix is pretty boring, but it is interesting to think about why pure-call crashes can happen at all. Since I like looking at disassembly and seeing what the compiler actually does with my code I'll take this approach here.

I recently spoke with a smart friend who's going through the process of learning C++ coding at a university. We started talking about abstract classes and one thing led to another until I brought up pure-call exceptions. More discussion ensued and I posed the question, "How can you actually cause a pure-call exception?"

If you think about it a bit, this should be impossible. There's no way to instantiate an abstract class. The compiler just won't let you. For any real subclasses of the abstract class the compiler will fill in an appropriate function pointer table. So what's the deal? How does this happen?

First, let's have a look at the memory structure of a typical C++ object containing virtual functions.

0:000:x86> ?? tmp
class BaseReal * 0x004a49a0
   +0x000 __VFN_table      : 0x01312110
   +0x004 m_data           : 0x1337beef

0:000:x86> dps 0x01312110 L5
01312110  013110c0 cppstuff!BaseReal::`scalar deleting destructor'
01312114  01311120 cppstuff!BaseReal::Get 
01312118  01311140 cppstuff!BaseReal::Sum 
0131211c  00000000
01312120  00000048

Here's an object tmp which contains a virtual function table pointer __VFN_table and then a single data element called m_data. I can see from dumping pointer-sized chunks of __VFN_table with symbol matching turned on that it's actually got the function pointers for the class called BaseReal.

This corresponds to the following code listing:

#include <stdio.h>
#include <stdlib.h>

class BaseAbstract
    BaseAbstract ();
    virtual ~BaseAbstract ();
    virtual unsigned int Get () = 0;
    virtual unsigned int Sum ();

class BaseReal : public BaseAbstract
    BaseReal ();
    virtual ~BaseReal ();
    virtual unsigned int Get ();
    virtual unsigned int Sum ();
    unsigned int m_data;

BaseAbstract::BaseAbstract ()
    //Sum ();  // BOOM.

BaseAbstract::~BaseAbstract ()

unsigned int BaseAbstract::Sum ()
    return Get () + 0;

BaseReal::BaseReal () : BaseAbstract ()
    m_data = 0x1337BEEF;

BaseReal::~BaseReal ()
    m_data = 0xDEADBEEF;

unsigned int BaseReal::Get ()
    return m_data;

unsigned int BaseReal::Sum ()
    return Get () + m_data;

int main (int argc, char* argv[])
    BaseReal *tmp = new BaseReal ();
    unsigned int value = tmp->Sum ();   // <<--- break point here.

    delete tmp;
    return 0;

If I uncomment the "BOOM" line and run again the application will crash before it gets to the break point. The interesting part is what happens before the crash. Let's have a look at the constructors disassembly. First, the BaseReal constructor:

0:000:x86> uf cppstuff!BaseReal::BaseReal
cppstuff!BaseReal::BaseReal :

   // Function prologue...
   41 00df1090 55              push    ebp
   41 00df1091 8bec            mov     ebp,esp

   // Setting up the "this" pointer (ecx usually contains 'this') and then
   // calling the BaseAbstract constructor.
   41 00df1093 51              push    ecx
   41 00df1094 894dfc          mov     dword ptr [ebp-4],ecx
   41 00df1097 8b4dfc          mov     ecx,dword ptr [ebp-4]
   41 00df109a e861ffffff      call    cppstuff!BaseAbstract::BaseAbstract (00df1000)

   // Loading eax with pointer to 'this' and then storing the virtual function
   // table for BaseReal (cppstuff!BaseReal::`vftable' (00df2120)
   41 00df109f 8b45fc          mov     eax,dword ptr [ebp-4]
   41 00df10a2 c7002021df00    mov     dword ptr [eax],
            offset cppstuff!BaseReal::`vftable' (00df2120)

   // Saving 0x1337BEEF to m_data.
   42 00df10a8 8b4dfc          mov     ecx,dword ptr [ebp-4]
   42 00df10ab c74104efbe3713  mov     dword ptr [ecx+4],1337BEEFh

   // Function epilogue...
   43 00df10b2 8b45fc          mov     eax,dword ptr [ebp-4]
   43 00df10b5 8be5            mov     esp,ebp
   43 00df10b7 5d              pop     ebp
   43 00df10b8 c3              ret

// Dumping the function table...
0:000:x86> dps cppstuff!BaseReal::`vftable' L3
00df2120  00df10c0 cppstuff!BaseReal::`scalar deleting destructor'
00df2124  00df1120 cppstuff!BaseReal::Get 
00df2128  00df1140 cppstuff!BaseReal::Sum 

This looks pretty reasonable. First there's the function prologue and then we do some C++ "this pointer" setup to make all that work. After that we immediately jump into the constructor for BaseAbstract. Once that work is done the m_data member is initialized. And now a look at the BaseAbstract constructor:

0:000:x86> uf cppstuff!BaseAbstract::BaseAbstract
cppstuff!BaseAbstract::BaseAbstract :

   // Function prologue...
   27 00df1000 55              push    ebp
   27 00df1001 8bec            mov     ebp,esp

   // Setting up the "this" pointer (ecx usually contains 'this') and 
   // saving it on the stack as a local in preparation for calling
   // the "Sum" function.
   27 00df1003 51              push    ecx
   27 00df1004 894dfc          mov     dword ptr [ebp-4],ecx

   // Loading eax with pointer to 'this' and then storing the virtual function
   // table for BaseReal (cppstuff!BaseAbstract::`vftable' (00df2110)
   27 00df1007 8b45fc          mov     eax,dword ptr [ebp-4]
   27 00df100a c7001021df00    mov     dword ptr [eax],offset 
            cppstuff!BaseAbstract::`vftable' (00df2110)
   28 00df1010 8b4dfc          mov     ecx,dword ptr [ebp-4]

   // Calling Sum -- which will fail.
   28 00df1013 e858000000      call    cppstuff!BaseAbstract::Sum (00df1070)

   // Function epilogue...
   29 00df1018 8b45fc          mov     eax,dword ptr [ebp-4]
   29 00df101b 8be5            mov     esp,ebp
   29 00df101d 5d              pop     ebp
   29 00df101e c3              ret

// Dumping the function table...
0:000:x86> dps cppstuff!BaseAbstract::`vftable' L3
00df2110  00df1020 cppstuff!BaseAbstract::`scalar deleting destructor'
00df2114  00df1224 cppstuff!purecall
00df2118  00df1070 cppstuff!BaseAbstract::Sum 

// Disassembly for BaseAbstract::Sum -- called in the constructor.
0:000:x86> uf cppstuff!BaseAbstract::Sum
cppstuff!BaseAbstract::Sum :

   // Prologue...
   36 00df1070 55              push    ebp
   36 00df1071 8bec            mov     ebp,esp

   // Saving ecx.  This is somewhat important.  Note: the "this call"
   // calling convention requires the "this pointer" to be in ecx.  The code
   // is using ebp-4 to stash the "this" pointer.
   36 00df1073 51              push    ecx

   // Copying "this" (ecx) to local storage -- anything with negative
   // ebp references is a local / spill location.
   36 00df1074 894dfc          mov     dword ptr [ebp-4],ecx

   // Load the address of the function table into eax.
   37 00df1077 8b45fc          mov     eax,dword ptr [ebp-4]

   // Dereference eax into edx -- now we have the function table.
   37 00df107a 8b10            mov     edx,dword ptr [eax]

   // Set ecx to "this" for the call, per calling convention.
   37 00df107c 8b4dfc          mov     ecx,dword ptr [ebp-4]

   // Deference the 2nd function table entry (the one for "Get").
   37 00df107f 8b4204          mov     eax,dword ptr [edx+4]
   37 00df1082 ffd0            call    eax

   // Epilogue...
   38 00df1084 8be5            mov     esp,ebp
   38 00df1086 5d              pop     ebp
   38 00df1087 c3              ret

There's the same "this" pointer initialization (although this was probably already done, these constructors have to work in a vacuum, so they may duplicate a little work). Next, the setup of the virtual function table and then the call to Sum. I think the compiler took a nice optimization here and didn't use the virtual function table to get the address of Sum. If this were code anywhere other than the constructor I imagine it would have used the function table pointer instead.

So now this brings us to the Sum code, which I also dumped. You can see it dereferences the virtual function table for the Get function call and then calls it. The problem is this is a pure virtual function so the table entry is for cppstuff!purecall; which is a function added by the compiler as a placeholder to indicate failure.

What are the lessons learned? You should never call virtual functions (or functions that call virtual functions) in the constructor or destructor. I didn't show the destructor code, but the whole process of loading the proper function table pointer and setting it is reversed.

Clear as mud?

Wednesday, September 14, 2011

Header Files and Named Parameters

If you're as much of a geek as I am you may hang out in IRC channels chatting it up with your other coder friends. Occasionally, you may also get into a heated debate about one topic or another.

About a week ago I got into just such a debate (and the geekness of it is a little embarrassing). We were debating whether you should name your parameters in header files or not. As a "software engineer" it's my belief that you should make the actual act of coding as easy as possible. There are enough ways to shoot yourself in the foot and it's always best to minimize complexity and leave that intellectual horsepower to the actual problem at hand.

I find that most seasoned "software engineers" feel this way as well. I've run into a few "programming is just hard" coders, but I believe a real "engineer" would care a little more about their users and also the interesting and unexpected problems that can arise from using other software packages. Given all this, of course I was on the "please, for the love of God, name those parameters" side of the discussion. Well, turns out I was wrong. My decision to include named parameters can actually make programming a LOT harder. Here's why:

Suppose I have 3 files, defined here:

// lib.h
#ifndef __LIB_H__
#define __LIB_H__

long sum (long int_arg, long int_arg2);

// lib.c
#include "lib.h"

long sum (long int_arg, long int_arg2)
    return (int_arg + int_arg2);
// main.c
#include <stdio.h>
#include <stdlib.h>
#include "lib.h"

int main (int argc, char *argv[])
    fprintf (stdout, "%d\n", sum (10, 10));
    return 0;

This all seems like it's going to work perfectly. If I compile and run this, I get "20" for my output. Everything is perfect.

Now then, suppose someone changes my main.c like so:

// main.c
#define int_arg long

#include <stdio.h>
#include <stdlib.h>
#include "lib.h"

int main (int argc, char *argv[])
    fprintf (stdout, "%d\n", sum (10, 10));
    return 0;

Now my output is the most unexpected: "10"

I know you may wonder why exactly someone would define int_arg to be something like long and I agree this is somewhat contrived. But, it can happen. This was the simplest plausible case I could come up with. Perhaps the coder is using a library that allows you to change the certain behavior by defining what an "int_arg" should be. Perhaps the library is using the "int_arg" type to indicate what type of integers it should read from a file. I know all of this is a stretch, but the important thing to remember is that this can happen. As all good engineers know, if it can happen it most likely will happen eventually.

Hmmm, so what happened here? Let's have a look at some disassembly to see.

// outputs "20"
  004113E0: 55                 push        ebp
  004113E1: 8B EC              mov         ebp,esp
  004113E3: 81 EC C0 00 00 00  sub         esp,0C0h
  004113E9: 53                 push        ebx
  004113EA: 56                 push        esi
  004113EB: 57                 push        edi
  004113EC: 8D BD 40 FF FF FF  lea         edi,[ebp-0C0h]
  004113F2: B9 30 00 00 00     mov         ecx,30h
  004113F7: B8 CC CC CC CC     mov         eax,0CCCCCCCCh
  004113FC: F3 AB              rep stos    dword ptr es:[edi]
  004113FE: 6A 0A              push        0Ah
  00411400: 6A 0A              push        0Ah
  00411402: E8 EE FC FF FF     call        @ILT+240(_sum)
  00411407: 83 C4 08           add         esp,8
  0041140A: 8B F4              mov         esi,esp

// outputs "10"
  004113E0: 55                 push        ebp
  004113E1: 8B EC              mov         ebp,esp
  004113E3: 81 EC C0 00 00 00  sub         esp,0C0h
  004113E9: 53                 push        ebx
  004113EA: 56                 push        esi
  004113EB: 57                 push        edi
  004113EC: 8D BD 40 FF FF FF  lea         edi,[ebp-0C0h]
  004113F2: B9 30 00 00 00     mov         ecx,30h
  004113F7: B8 CC CC CC CC     mov         eax,0CCCCCCCCh
  004113FC: F3 AB              rep stos    dword ptr es:[edi]
  004113FE: 6A 0A              push        0Ah
  00411400: 6A 00              push        0
  00411402: 6A 0A              push        0Ah
  00411404: E8 EC FC FF FF     call        @ILT+240(_sum)
  00411409: 83 C4 0C           add         esp,0Ch
  0041140C: 8B F4              mov         esi,esp

Huh, so that's weird. Looks like if int_arg is defined as a long then the compiler seems to think it should pass an extra 32-bits to the "sum" function. You can also see this in the "stack cleanup" code directly after the function call -- there's an extra 4 bytes being pulled off the stack. The extra 4 bytes are coming from that extra push onto the stack before the function call. So why is this?

Well, the compiler is converting long int_arg to be long long and then assuming it's an unnamed parameter. Later on, the actual implementation doesn't have the define and is compiled as expecting a long, not a long long. Well, that's odd, but you may ask, "It's going to throw some linker warnings, right?"

Well, no. The reason this whole thing compiles and links without warnings is because these are C functions. There is no name decoration. The linker is simply looking for a function called sum that it can use. It finds one and no linker-time type-checking is performed. The only reason C++ has any sort of name declaration is because functions can be overloaded with different parameter types. In C++ this would be a linker error but in C it's just a horrible horrible bug.

So there's a case where I was advocating for making things easier only to possibly introduce some really hard-to-find bugs. How would I fix this? I'd add some nice comments to these functions in the header file, which is a common convention. Additionally, I know understand why this convention is used.

// lib.h
#ifndef __LIB_H__
#define __LIB_H__

long sum (long /* int_arg */, long /* int_arg2 */);



Friday, September 2, 2011

WinDBG microtip

I constantly find myself reading variables and structures the hard way in WinDBG... Why don't I ever just use the "??" command!?

Thursday, August 25, 2011

DBH: How the Debugger Help utility can help.

My work in servicing Windows usually results in a bunch of "This component is crashing, please investigate it" questions. One of the first things I always need to do is locate the source code for that component. Lately, I've been using a nice little utility, dbh.exe to accomplish this.

dbh.exe is included with the Windows Debugging tools and it basically provides a command line interface to the dbghelp.dll. For code-finding purposes I'll use the "src" option.

Here's the output on a small utility I'm working on (it's much more useful for large source bases (like Windows) where the code is spread out in many separate source code depots):
c:\>dbh d:\data\projects\base\windd\Debug\windd.exe src *.c*

I had to cut off some more of the display because this is on a development machine with private Windows symbols -- it displays all the source paths for the Visual Studio CRT, Windows source files that define various GUIDs used by my application, and other miscellaneous junk.

The most important thing I tell people to answer the "Where does the code for XYZ live?" question is: "Check out dbh.exe in the debugging tools." If you have the symbols to your code it's amazing what you can do with the default tools. There are a few features I'd like to see (disassembly of a function with line data, etc...) and perhaps I'll be making my own version of dbh.exe shortly.

There's so much to dbh.exe and I haven't even scratched the surface with this post. I hope you find it as useful as I do.

Wednesday, August 3, 2011

No More Memory Leaks!


One of the most common problems in coding is tracking down memory leaks. I remember working at IGT and trying to track down a really nasty problem -- we had some code that seemed to leak a very small amount of memory. On most products you may actually just yell, "Ship It!" and be done with it but this wasn't an option. The code in question had to run for a very long time -- multiple years depending on the quality of power at the location. A slot machine handles money and having software dealing with money slowly consume all the available memory on a system is not acceptable.

Tracking down the offending memory allocation was tedious and time consuming. Reviewing source code and making changes then setting up tests to stress the environment and monitor the memory growth using various system APIs to query available memory. I worked with a great electrical engineer who even spent the extra time to load the data into Excel and come up with graphs. (This didn't really increase our productivity, but it made it look like we were doing something substantial.)

Fast-forward about five years to my work on Microsoft Windows in the Windows Sustained Engineering group. Obviously, memory leaks are still an issue. More importantly, they are trouble for many of our customers. Whenever leaks are reported by customers we need to investigate, find the leak, and fix it. Code review is out of the question for something as substantial as Windows with literally gigabytes of source code. Luckily, there's some great tools available for use on Windows.

User Mode Stack Traces & UMDH

User Mode Stack Traces can be enabled by the gflags.exe utility available in the Debugging Tools for Windows package.
I won't provide a real link here for fear the tag will eventually expire, instead use your favorite search engine to find the latest version. NOTE: Gflags.exe is a powerful and terribly dangerous tool. I don't have time to review all the options available so please be careful.
Gflags is just a utility to enable the appropriate options at the OS level. Windows has plenty of diagnostic information available, this is one of the reasons I consider Windows so "developer friendly." The User Mode Stack Trace options can be found Image File tab of GFlags. You'll need to enter the application name in the image name dialog and hit TAB. Once that's done you can select Create user mode stack trace database to configure tracking for the executable. Make sure you click Apply and OK when done.

Another option is to use the command-line interface of gflags.
gflags.exe -i notepad.exe +ust

Now that Windows will track the allocations, you can start using UMDH to capture and analyze "dumps." UMDH stands for User Mode Dump Heap. UMDH.exe is a handy little utility provided with the Debugging Tools for Windows package which operates in two modes. The first mode creates dumps by accessing the user mode stack trace database windows creates for properly configured processes. The second mode analyzes the differences between any two dumps.

I like to think of UMDH dumps as allocation-state snapshots in time. UMDH allows you to compare two snapshots and view the differences between them. Not only will you see memory allocations but you will also see memory de-allocations.

Putting It Together

With the tools introduction done I can finally get to where the magic happens. Here's an example session using the tools:
--- Setup leaky.exe to track user mode stack traces.
C:\test>c:\debuggers\gflags.exe -i leaky.exe +ust

--- Launch leaky.exe (in a different window)

--- Take the first trace.
C:\test>c:\debuggers\umdh.exe -pn:leaky.exe -f:Dump0.txt

--- Take a second trace after executing "memory leaking" operations.
C:\test>c:\debuggers\umdh.exe -pn:leaky.exe -f:Dump1.txt

--- Use the second mode of UMDH to diff the traces.
C:\test>c:\debuggers\umdh.exe Dump0.txt Dump1.txt > Dump_output.txt

--- View the output.
C:\test>type Dump_output.txt
// _NT_SYMBOL_PATH set by default to C:\Windows\symbols
// Debug library initialized ...
DBGHELP: leaky - private symbols & lines 
DBGHELP: ntdll - export symbols
DBGHELP: kernel32 - export symbols
DBGHELP: KERNELBASE - export symbols
// Each log entry has the following syntax:                                 
// + COUNT_DELTA (NEW_COUNT - OLD_COUNT) BackTrace TRACEID allocations      
//     ... stack trace ...                                                  
// where:                                                                   
//     BYTES_DELTA - increase in bytes between before and after log         
//     NEW_BYTES - bytes in after log                                       
//     OLD_BYTES - bytes in before log                                      
//     COUNT_DELTA - increase in allocations between before and after log   
//     NEW_COUNT - number of allocations in after log                       
//     OLD_COUNT - number of allocations in before log                      
//     TRACEID - decimal index of the stack trace in the trace database     
//         (can be used to search for allocation instances in the original  
//         UMDH logs).                                                      

+    1000 (  1c00 -   c00)      7 allocs BackTrace74F40
+       4 (     7 -     3) BackTrace74F40 allocations

 leaky!malloc+0000005B (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\malloc.c, 89)
 leaky!operator new+0000001F (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\new.cpp, 59)
 leaky!Leaky+00000015 (d:\data\projects\base\leaky\leaky\main.cpp, 6)
 leaky!main+0000005C (d:\data\projects\base\leaky\leaky\main.cpp, 18)
 leaky!__tmainCRTStartup+0000013B (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crt0.c, 278)

Total increase ==   1000 requested +     d0 overhead =   10d0
I can clearly see that leaky!Leaky was called by leaky!main 4 times between the time I took Dump0 and Dump1. I can also see that the amount of memory allocated from these 4 calls was 0x1000 bytes, with a full allocation size including overhead of 0x10d0 bytes. One of the other nice things is I can see this was a C++ "new" allocation and that visual studio eventually converts a "new" into a "malloc" to actually request the memory.

Well, that's a very simple example but I've used this same technique for much more complicated memory leaks in the past -- there's just a few more entries in the call stacks list :). Now you know how to track down (and hopefully prevent) all memory leaks in your code. You can learn a lot when interpreting these UMDH logs. I've included the MSDN article for UMDH log interpretation in the references section. Also, because most system resources are backed by some sort of dynamically allocated memory this technique will also work for other resources leaks as well.

Happy coding!