Thursday, August 25, 2011

DBH: How the Debugger Help utility can help.

My work in servicing Windows usually results in a bunch of "This component is crashing, please investigate it" questions. One of the first things I always need to do is locate the source code for that component. Lately, I've been using a nice little utility, dbh.exe, to accomplish this.

dbh.exe is included with the Debugging Tools for Windows, and it basically provides a command-line interface to dbghelp.dll. For code-finding purposes I'll use the "src" option.

Here's the output on a small utility I'm working on. It's much more useful for large source bases like Windows, where the code is spread out across many separate source code depots:
c:\>dbh d:\data\projects\base\windd\Debug\windd.exe src *.c*

d:\data\projects\base\windd\windd\main.cpp
d:\data\projects\base\windd\windd\globals.cpp
d:\data\projects\base\windd\windd\datastream.cpp
d:\data\projects\base\windd\windd\commands.cpp
...
I had to cut off some more of the display because this is on a development machine with private Windows symbols -- it displays all the source paths for the Visual Studio CRT, Windows source files that define various GUIDs used by my application, and other miscellaneous junk.
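Because the raw listing mixes in CRT and other unrelated paths, one way to trim it is to post-process the text that dbh prints. Here's a minimal Python sketch, assuming the output is one source path per line as in the listing above. The function name project_sources and the project_root parameter are my own inventions for illustration, not part of dbh.

```python
from pathlib import PureWindowsPath


def project_sources(dbh_output: str, project_root: str) -> list[str]:
    """Keep only the source paths that live under project_root.

    dbh_output is the text printed by a command such as
    `dbh <image> src *.c*`, assumed to be one path per line.
    """
    root = PureWindowsPath(project_root)
    kept = []
    for line in dbh_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # PureWindowsPath compares case-insensitively, like Windows does.
        if root in PureWindowsPath(line).parents:
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = r"""
d:\data\projects\base\windd\windd\main.cpp
f:\dd\vctools\crt_bld\self_64_amd64\crt\src\malloc.c
d:\data\projects\base\windd\windd\globals.cpp
"""
    for path in project_sources(sample, r"d:\data\projects\base\windd"):
        print(path)
```

Filtering on the path prefix rather than the file name keeps CRT files such as malloc.c out of the list even when they match the same *.c* mask.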

The most important thing I tell people to answer the "Where does the code for XYZ live?" question is: "Check out dbh.exe in the debugging tools." If you have the symbols for your code, it's amazing what you can do with the default tools. There are a few features I'd like to see, such as disassembly of a function with line data, and perhaps I'll be making my own version of dbh.exe shortly.

There's so much to dbh.exe and I haven't even scratched the surface with this post. I hope you find it as useful as I do.

Wednesday, August 3, 2011

No More Memory Leaks!

Introduction

One of the most common problems in coding is tracking down memory leaks. I remember working at IGT and trying to track down a really nasty problem -- we had some code that seemed to leak a very small amount of memory. On most products you may actually just yell, "Ship It!" and be done with it but this wasn't an option. The code in question had to run for a very long time -- multiple years depending on the quality of power at the location. A slot machine handles money and having software dealing with money slowly consume all the available memory on a system is not acceptable.

Tracking down the offending memory allocation was tedious and time consuming. It meant reviewing source code, making changes, and then setting up tests to stress the environment while monitoring memory growth with various system APIs that query available memory. I worked with a great electrical engineer who even spent the extra time to load the data into Excel and come up with graphs. (This didn't really increase our productivity, but it made it look like we were doing something substantial.)

Fast-forward about five years to my work on Microsoft Windows in the Windows Sustained Engineering group. Obviously, memory leaks are still an issue. More importantly, they are trouble for many of our customers. Whenever leaks are reported by customers we need to investigate, find the leak, and fix it. Code review is out of the question for something as substantial as Windows, with literally gigabytes of source code. Luckily, there are some great tools available for use on Windows.

User Mode Stack Traces & UMDH

User Mode Stack Traces can be enabled by the gflags.exe utility available in the Debugging Tools for Windows package.
I won't provide a real link here for fear it will eventually expire; instead, use your favorite search engine to find the latest version. NOTE: Gflags.exe is a powerful and terribly dangerous tool. I don't have time to review all the options available, so please be careful.
Gflags is just a utility to enable the appropriate options at the OS level. Windows has plenty of diagnostic information available, which is one of the reasons I consider Windows so "developer friendly." The User Mode Stack Trace option can be found on the Image File tab of GFlags. You'll need to enter the application name in the Image box and hit TAB. Once that's done, you can select "Create user mode stack trace database" to configure tracking for the executable. Make sure you click Apply and OK when done.

Another option is to use the command-line interface of gflags.
gflags.exe -i notepad.exe +ust
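If you want to confirm the setting took effect, the image file flags end up in the registry. The Python sketch below rests on two assumptions that the post itself doesn't spell out: that +ust corresponds to the flag bit 0x1000 (FLG_USER_STACK_TRACE_DB), and that gflags stores per-image flags in a GlobalFlag value under the Image File Execution Options key. The function names are made up for illustration, and the registry read only works on Windows.

```python
# Assumption: the "ust" flag is bit 0x1000 (FLG_USER_STACK_TRACE_DB).
FLG_USER_STACK_TRACE_DB = 0x1000

# Assumption: gflags -i stores per-image flags under this HKLM key.
IFEO_KEY = (
    r"SOFTWARE\Microsoft\Windows NT\CurrentVersion"
    r"\Image File Execution Options"
)


def ust_enabled(global_flag: int) -> bool:
    """Return True if the user mode stack trace bit is set."""
    return bool(global_flag & FLG_USER_STACK_TRACE_DB)


def read_global_flag(image_name: str) -> int:
    """Read GlobalFlag for an image; returns 0 if nothing is configured."""
    import winreg  # Windows-only module, imported lazily on purpose

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE, IFEO_KEY + "\\" + image_name
        ) as key:
            value, _kind = winreg.QueryValueEx(key, "GlobalFlag")
    except FileNotFoundError:
        return 0
    # The value may be stored as a number or as a string such as "0x1000".
    return int(value, 0) if isinstance(value, str) else int(value)


if __name__ == "__main__":
    import sys

    if sys.platform == "win32":
        print(ust_enabled(read_global_flag("notepad.exe")))
```

This only reads the value; leave the writing to gflags itself.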

Now that Windows will track the allocations, you can start using UMDH to capture and analyze "dumps." UMDH stands for User Mode Dump Heap. UMDH.exe is a handy little utility provided with the Debugging Tools for Windows package, and it operates in two modes. The first mode creates dumps by reading the user mode stack trace database that Windows maintains for properly configured processes. The second mode analyzes the differences between any two dumps.

I like to think of UMDH dumps as allocation-state snapshots in time. UMDH allows you to compare two snapshots and view the differences between them. Not only will you see memory allocations but you will also see memory de-allocations.

Putting It Together

With the tools introduction done I can finally get to where the magic happens. Here's an example session using the tools:
--- Setup leaky.exe to track user mode stack traces.
C:\test>c:\debuggers\gflags.exe -i leaky.exe +ust

--- Launch leaky.exe (in a different window)

--- Take the first trace.
C:\test>c:\debuggers\umdh.exe -pn:leaky.exe -f:Dump0.txt

--- Take a second trace after executing "memory leaking" operations.
C:\test>c:\debuggers\umdh.exe -pn:leaky.exe -f:Dump1.txt

--- Use the second mode of UMDH to diff the traces.
C:\test>c:\debuggers\umdh.exe Dump0.txt Dump1.txt > Dump_output.txt

--- View the output.
C:\test>type Dump_output.txt
// _NT_SYMBOL_PATH set by default to C:\Windows\symbols
// Debug library initialized ...
DBGHELP: leaky - private symbols & lines 
        .\leaky.pdb
DBGHELP: ntdll - export symbols
DBGHELP: kernel32 - export symbols
DBGHELP: KERNELBASE - export symbols
//                                                                          
// Each log entry has the following syntax:                                 
//                                                                          
// + BYTES_DELTA (NEW_BYTES - OLD_BYTES) NEW_COUNT allocs BackTrace TRACEID 
// + COUNT_DELTA (NEW_COUNT - OLD_COUNT) BackTrace TRACEID allocations      
//     ... stack trace ...                                                  
//                                                                          
// where:                                                                   
//                                                                          
//     BYTES_DELTA - increase in bytes between before and after log         
//     NEW_BYTES - bytes in after log                                       
//     OLD_BYTES - bytes in before log                                      
//     COUNT_DELTA - increase in allocations between before and after log   
//     NEW_COUNT - number of allocations in after log                       
//     OLD_COUNT - number of allocations in before log                      
//     TRACEID - decimal index of the stack trace in the trace database     
//         (can be used to search for allocation instances in the original  
//         UMDH logs).                                                      
//                                                                          


+    1000 (  1c00 -   c00)      7 allocs BackTrace74F40
+       4 (     7 -     3) BackTrace74F40 allocations

 ntdll!MD5Final+0000B3DD
 leaky!malloc+0000005B (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\malloc.c, 89)
 leaky!operator new+0000001F (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\new.cpp, 59)
 leaky!Leaky+00000015 (d:\data\projects\base\leaky\leaky\main.cpp, 6)
 leaky!main+0000005C (d:\data\projects\base\leaky\leaky\main.cpp, 18)
 leaky!__tmainCRTStartup+0000013B (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crt0.c, 278)
 kernel32!BaseThreadInitThunk+0000000D
 ntdll!RtlUserThreadStart+00000021


Total increase ==   1000 requested +     d0 overhead =   10d0
I can clearly see that the call stack through leaky!main and leaky!Leaky has 4 more outstanding allocations in Dump1 than it had in Dump0 (7 versus 3). I can also see that these 4 allocations account for 0x1000 bytes, or 0x10d0 bytes once heap overhead is included. Another nice thing is that I can see this was a C++ "new" allocation, and that the Visual C++ runtime's operator new calls malloc to actually request the memory.
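The arithmetic behind that reading is easy to script. Here's a small Python sketch that parses the two summary lines of a UMDH diff entry, following the syntax described in the log header. It assumes every number is hexadecimal, as in the output above; the regular expressions and the parse_entry name are mine, not anything UMDH provides.

```python
import re

HEX = r"([0-9a-fA-F]+)"

# + BYTES_DELTA (NEW_BYTES - OLD_BYTES) NEW_COUNT allocs BackTrace TRACEID
BYTES_LINE = re.compile(
    rf"^([+-])\s*{HEX}\s*\(\s*{HEX}\s*-\s*{HEX}\s*\)\s*{HEX}\s+allocs\s+BackTrace(\w+)"
)
# + COUNT_DELTA (NEW_COUNT - OLD_COUNT) BackTrace TRACEID allocations
COUNT_LINE = re.compile(
    rf"^([+-])\s*{HEX}\s*\(\s*{HEX}\s*-\s*{HEX}\s*\)\s*BackTrace(\w+)\s+allocations"
)


def parse_entry(bytes_line: str, count_line: str) -> dict:
    """Turn the two summary lines of one diff entry into integers."""
    b = BYTES_LINE.match(bytes_line.strip())
    c = COUNT_LINE.match(count_line.strip())
    if not b or not c:
        raise ValueError("not a UMDH diff entry")
    bytes_sign = -1 if b.group(1) == "-" else 1
    count_sign = -1 if c.group(1) == "-" else 1
    return {
        "trace_id": b.group(6),
        "bytes_delta": bytes_sign * int(b.group(2), 16),
        "new_bytes": int(b.group(3), 16),
        "old_bytes": int(b.group(4), 16),
        "new_count": int(b.group(5), 16),
        "old_count": int(c.group(4), 16),
        "count_delta": count_sign * int(c.group(2), 16),
    }


if __name__ == "__main__":
    entry = parse_entry(
        "+    1000 (  1c00 -   c00)      7 allocs BackTrace74F40",
        "+       4 (     7 -     3) BackTrace74F40 allocations",
    )
    print(entry)
    # Average size of the allocations that appeared between the two dumps.
    print(entry["bytes_delta"] // entry["count_delta"])
```

For the entry above this works out to 0x1000 bytes across 4 allocations, an average of 0x400 (1024) bytes each.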

Well, that's a very simple example, but I've used this same technique for much more complicated memory leaks in the past -- there are just a few more entries in the call stacks list :). Now you know how to track down (and hopefully prevent) memory leaks in your code. You can learn a lot when interpreting these UMDH logs. I've included the MSDN article for UMDH log interpretation in the references section. Also, because most system resources are backed by some sort of dynamically allocated memory, this technique will work for other resource leaks as well.
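For longer investigations, the capture-and-diff steps from the session above can be scripted. This is only a sketch in Python: it reuses the -pn: and -f: options and the c:\debuggers path exactly as they appear in the session, and the helper names are hypothetical. Adjust the path to wherever your debugging tools live.

```python
import subprocess

# Path taken from the example session; change it for your machine.
UMDH = r"c:\debuggers\umdh.exe"


def snapshot_cmd(process_name: str, log_file: str) -> list[str]:
    """Command line for the first UMDH mode: dump a running process."""
    return [UMDH, f"-pn:{process_name}", f"-f:{log_file}"]


def diff_cmd(before_log: str, after_log: str) -> list[str]:
    """Command line for the second UMDH mode: compare two dumps."""
    return [UMDH, before_log, after_log]


def take_snapshot(process_name: str, log_file: str) -> None:
    """Run UMDH against a process that was launched with +ust enabled."""
    subprocess.run(snapshot_cmd(process_name, log_file), check=True)


def write_diff(before_log: str, after_log: str, out_file: str) -> None:
    """Run the diff and redirect its output to out_file."""
    with open(out_file, "w") as out:
        subprocess.run(diff_cmd(before_log, after_log), stdout=out, check=True)


if __name__ == "__main__":
    take_snapshot("leaky.exe", "Dump0.txt")
    input("Exercise the leaking operation, then press Enter...")
    take_snapshot("leaky.exe", "Dump1.txt")
    write_diff("Dump0.txt", "Dump1.txt", "Dump_output.txt")
```

Building the argument lists in separate functions makes it easy to check the command lines without actually running UMDH.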

Happy coding!

References

http://support.microsoft.com/kb/268343
http://msdn.microsoft.com/en-us/Library/ff551046(v=VS.85).aspx