Saturday, September 6, 2014

Side projects... what next?

Lately I've been having a hard time deciding which side project to work on at home. Of course, having a seven-month-old has severely limited how much time I have for side projects, but I find it's still important to work out some "work-suppressed" creativity. A coworker from my IGT days has been wanting to work on a game for a few weeks now. He's been working with Unity and probably wants to work out some of his own work-suppressed creativity. This guy is an awesome 3D artist / modeler, and I really enjoyed working with him at IGT. We worked together on the last large project I did at IGT before joining Microsoft, and I had a lot of fun. He taught me about skeletal animation, and we worked together to extend an existing toolchain to support it. That was some of the most fun I've had in a job.

However, I'm also becoming extremely interested in operating systems. I've taken my "OS" to the point where I really need to start making decisions about the early boot environment and the handoff to the kernel. Lately, I've been working on a FAT32 format utility specific to my OS. I need a custom format utility so I can add my own boot sector code and lay out the supplemental sectors accordingly. FAT32 is surprisingly simple, but also not all that interesting, so the project has stalled here for a bit. Perhaps I'll just skip this and keep using the same 10MB "chunk" for booting as I have been. The problem is that I want to start adding executable image and loader support, along with other things, so I figure I might as well have a real filesystem to work against.

There are a few other smaller projects I've been thinking about as well. One is the diff utility based on a "rolling hash" function. I had a beta of this working in C# with WPF rendering and it worked well. My next step is to turn it into a C library / DLL which I can use in several other utilities I'd like to create. I've also looked at making a code editor a few times. I usually get hung up on Unicode support because it's one of those things you really need to plan for up front. While this may not have resulted in much of a code editor, I certainly learned a bunch about Unicode and also thought up the concept of an index tree. I've also thought about starting up work on my fractal terrain generation and Perlin noise algorithms again. I was using XNA and there was some limitation on the type of integer calculations I could perform efficiently in the shader. I can't remember the details since it was a couple of years ago, but I'd definitely like to revisit this using newer DirectX or OpenGL shaders, which may not have the same limitation I encountered.
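For anyone unfamiliar with the term, here is a minimal sketch of one common kind of rolling hash: a polynomial, Rabin-Karp style hash over a sliding window of bytes. This is only an illustration, not the function my diff utility uses; the names `BASE`, `MOD`, `rolling_hashes` and `direct_hash` are made up for this example.

```python
BASE = 257             # assumed multiplier, any value larger than a byte works
MOD = (1 << 61) - 1    # assumed large prime modulus to keep collisions rare


def direct_hash(chunk: bytes) -> int:
    """Hash a chunk from scratch; used as the reference for the rolling version."""
    h = 0
    for b in chunk:
        h = (h * BASE + b) % MOD
    return h


def rolling_hashes(data: bytes, window: int):
    """Yield (offset, hash) for every window-sized slice of data.

    Each step removes the byte leaving the window and adds the byte
    entering it, so the cost per step does not depend on the window size.
    """
    if window <= 0 or len(data) < window:
        return
    high = pow(BASE, window - 1, MOD)   # weight of the oldest byte in the window
    h = direct_hash(data[:window])
    yield 0, h
    for i in range(window, len(data)):
        h = (h - data[i - window] * high) % MOD   # drop the oldest byte
        h = (h * BASE + data[i]) % MOD            # shift and append the new byte
        yield i - window + 1, h
```

Windows with identical content get identical hashes, which is what makes this useful for finding matching regions between two inputs. A real diff tool would still compare the bytes before trusting a hash match.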

Yep... A lot of choices. I also had the beginnings of a 3D graphics rendering engine in the works -- it was using Blender as the creation framework and I had Python scripts to export into the engine formats. It was coming along. What I'd really like is a job where I could work on one of these things to the point that it's useful.

Friday, September 5, 2014

Microtip: Stack corruption? WinDbg dps to the rescue.

Everyone encounters it sooner or later: a crash with a corrupted callstack. You know what the stack pointer is, but it is clearly not unwinding properly. With a little elbow grease, I've found the WinDbg command "dps" to be extremely helpful in this situation.

With this, you're telling the debugger to dump data in pointer-sized chunks and to try to match each value against loaded symbols. You'll quickly start seeing *some* sort of callstack in this display. Now the trick is to sort through the various locations to find the actual callstack. WinDbg can help with this as well: you can tell it where the stack actually starts using the kn = [base address] command, which lets you verify whether what you think is a good stack really is one. The trick is to look at the symbol offsets. Pretty much all real function return addresses will have a non-zero offset, because a return address points at the instruction after a call, which sits inside a function rather than at its start. Usually there are few enough of these that you can just brute force it, as you'll see me doing below (although I skipped a few for brevity). If there are a ton of potential addresses, it may require additional detective work. One important gotcha: there may appear to be more than one valid stack, so at that point you need to read the source code or disassembly and determine what makes the most sense. Knowing the ABI of the platform you're working on is also invaluable.
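If the dps output is long, the "look for non-zero offsets" step can be roughed out in a script. The sketch below assumes you have pasted 64-bit dps output into a string; it picks out the stack slots whose symbol ends in a non-zero +0x offset and prints the matching kn command for each one. The names `candidate_return_slots` and `kn_commands` are hypothetical helpers, not WinDbg features, and the result is only a list of candidates that still needs to be checked by hand.

```python
import re

# One line of 64-bit `dps` output: slot address, stored value, optional symbol.
# Example: "0000006d`6dcdf438  00007ff6`d6295a90 MicroLogUtil!_lock+0x50"
DPS_LINE = re.compile(
    r"^\s*([0-9a-fA-F]{8}`[0-9a-fA-F]{8})"      # stack slot address
    r"\s+([0-9a-fA-F]{8}`[0-9a-fA-F]{8})"       # value stored in the slot
    r"(?:\s+(\S.*?))?\s*$"                      # symbol text, if any
)
# A symbol that ends in a hex offset such as "+0x50".
OFFSET = re.compile(r"\+0x([0-9a-fA-F]+)$")


def candidate_return_slots(dps_text):
    """Return (slot, symbol) pairs whose symbol has a non-zero offset."""
    found = []
    for line in dps_text.splitlines():
        m = DPS_LINE.match(line)
        if not m or not m.group(3):
            continue                             # prompt line or no symbol match
        slot, _value, symbol = m.groups()
        off = OFFSET.search(symbol)
        if off and int(off.group(1), 16) != 0:
            found.append((slot, symbol))
    return found


def kn_commands(candidates):
    """Build one `kn = <slot>` command per candidate slot."""
    return [f"kn = {slot}" for slot, _symbol in candidates]


SAMPLE = """\
0:000> dps
0000006d`6dcdf430  0000006d`6dd86480
0000006d`6dcdf438  00007ff6`d6295a90 MicroLogUtil!_lock+0x50
0000006d`6dcdf440  00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf448  00000000`00095000
"""

if __name__ == "__main__":
    for cmd in kn_commands(candidate_return_slots(SAMPLE)):
        print(cmd)
```

Symbols with no offset, such as MicroLogUtil!__ImageBase, are skipped because they look like stored pointers rather than return addresses. The filter can still produce false positives, so each candidate needs the same sanity check as before.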

This is a little tedious, but the worst bugs only rarely repro and can hold up a product release...

0:000> kn
 # Child-SP          RetAddr           Call Site
00 0000006d`6dcdf310 0000006d`6dd80000 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf318 6f6e6d6c`402c0062 0x0000006d`6dd80000
02 0000006d`6dcdf320 00000000`00000470 0x6f6e6d6c`402c0062
03 0000006d`6dcdf328 00000000`00000480 0x470
04 0000006d`6dcdf330 00000000`00000000 0x480
0:000> * Oh noooooooo!!!!
0:000> dps 0000006d`6dcdf310
0000006d`6dcdf310  0000006d`6dd80000
0000006d`6dcdf318  6f6e6d6c`402c0062
0000006d`6dcdf320  00000000`00000470
0000006d`6dcdf328  00000000`00000480
0000006d`6dcdf330  00000000`00000000
0000006d`6dcdf338  0000006d`6dcdf428
0000006d`6dcdf340  77767574`73727170
0000006d`6dcdf348  00007fff`ba9738d8 KERNELBASE!VirtualQuery+0x28
0000006d`6dcdf350  87868584`83828180
0000006d`6dcdf358  8f9e8d9c`8b9a8988
0000006d`6dcdf360  97969594`93929190
0000006d`6dcdf368  ff9e9d9c`9b9a9998
0000006d`6dcdf370  00000000`00000030
0000006d`6dcdf378  0000006d`6dcdf3a8
0000006d`6dcdf380  0000006d`6dcdf440
0000006d`6dcdf388  00007fff`ba983552 KERNELBASE!SetUnhandledExceptionFilter+0x24a
0:000> dps
0000006d`6dcdf390  e7e6e5e4`e3e2e1e0
0000006d`6dcdf398  efeeedec`ebeae9e8
0000006d`6dcdf3a0  d7f6f5f4`f3f2f1f0
0000006d`6dcdf3a8  00000000`00000030
0000006d`6dcdf3b0  00000000`0000021a
0000006d`6dcdf3b8  00000000`00000000
0000006d`6dcdf3c0  00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf3c8  01000000`00000080
0000006d`6dcdf3d0  00000000`00095000
0000006d`6dcdf3d8  00007fff`bd24dcb7 ntdll!RtlDecodePointer+0x27
0000006d`6dcdf3e0  00000000`00000000
0000006d`6dcdf3e8  00000000`00000000
0000006d`6dcdf3f0  00000000`00000000
0000006d`6dcdf3f8  00000000`00000000
0000006d`6dcdf400  00000000`00000000
0000006d`6dcdf408  00007ff6`d62a5ec0 MicroLogUtil!__CxxUnhandledExceptionFilter
0:000> dps
0000006d`6dcdf410  000012b3`e0966000
0000006d`6dcdf418  00007fff`ba983476 KERNELBASE!SetUnhandledExceptionFilter+0x16e
0000006d`6dcdf420  00000000`00000000
0000006d`6dcdf428  00000000`959f04b3
0000006d`6dcdf430  0000006d`6dd86480
0000006d`6dcdf438  00007ff6`d6295a90 MicroLogUtil!_lock+0x50
0000006d`6dcdf440  00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf448  00000000`00095000
0000006d`6dcdf450  03810381`01000000
0000006d`6dcdf458  00000000`005a0058
0000006d`6dcdf460  0000006d`6dcdf468
0000006d`6dcdf468  00007ff6`d6299232 MicroLogUtil!_heap_alloc_dbg_impl+0x32
0000006d`6dcdf470  00000000`00000004
0000006d`6dcdf478  00007fff`bd2e1ebc ntdll!RtlZeroHeap+0x6e8
0000006d`6dcdf480  006b0073`00690064
0000006d`6dcdf488  0075006c`006f0056
0:000> dps
0000006d`6dcdf490  005c0032`0065006d
0000006d`6dcdf498  005c0070`006d0074
0000006d`6dcdf4a0  00720063`0069004d
0000006d`6dcdf4a8  0067006f`10070017
0000006d`6dcdf4b0  00000000`6dd86310
0000006d`6dcdf4b8  00007fff`bd2be5e7 ntdll!memset+0x1fa27
0000006d`6dcdf4c0  00000000`00000000
0000006d`6dcdf4c8  00000000`00000001
0000006d`6dcdf4d0  0000006d`6dd80000
0000006d`6dcdf4d8  00007ff6`d6299ab9 MicroLogUtil!_nh_malloc_dbg_impl+0x39
0000006d`6dcdf4e0  00000000`00000030
0000006d`6dcdf4e8  00000000`00000001
0000006d`6dcdf4f0  00000000`00000000
0000006d`6dcdf4f8  00007fff`00000000
0000006d`6dcdf500  0000006d`6dcdf560
0000006d`6dcdf508  00000000`00000000
0:000> dps
0000006d`6dcdf510  00000000`00000000
0000006d`6dcdf518  02100210`02100210
0000006d`6dcdf520  00000000`00000101
0000006d`6dcdf528  00007ff6`d6299a49 MicroLogUtil!_nh_malloc_dbg+0x49
0000006d`6dcdf530  00000000`00000030
0000006d`6dcdf538  00007ff6`00000000
0000006d`6dcdf540  00000000`00000001
0000006d`6dcdf548  00000000`00000000
0000006d`6dcdf550  00000000`00000000
0000006d`6dcdf558  0000006d`6dcdf560
0000006d`6dcdf560  00000000`00000000
0000006d`6dcdf568  00000000`00000000
0000006d`6dcdf570  00000000`00000000
0000006d`6dcdf578  00007ff6`d6294b7a MicroLogUtil!malloc+0x2a
0000006d`6dcdf580  00000000`00000030
0000006d`6dcdf588  0000006d`00000000
0:000> dps
0000006d`6dcdf590  0000006d`00000001
0000006d`6dcdf598  00000000`00000000
0000006d`6dcdf5a0  00000000`00000000
0000006d`6dcdf5a8  00007ff6`d6295db2 MicroLogUtil!_unlock+0x22
0000006d`6dcdf5b0  00000000`00000000
0000006d`6dcdf5b8  02100302`00000030
0000006d`6dcdf5c0  00007ff6`00000000
0000006d`6dcdf5c8  00007ff6`d62915e6 MicroLogUtil!AstNodeCreate+0x26
0000006d`6dcdf5d0  00000000`00000030
0000006d`6dcdf5d8  00007ff6`d62999b4 MicroLogUtil!_msize_dbg+0x234
0000006d`6dcdf5e0  0000006d`00000004
0000006d`6dcdf5e8  00007ff6`d62aa677 MicroLogUtil!_setmbcp_nolock+0x447
0000006d`6dcdf5f0  00000000`00000000
0000006d`6dcdf5f8  00007fff`bd265a63 ntdll!RtlEncodePointer+0x27
0000006d`6dcdf600  00000000`00000100
0000006d`6dcdf608  00007ff6`d62919e8 MicroLogUtil!AstStateInit+0x48
0:000> dps
0000006d`6dcdf610  00000000`00000000
0000006d`6dcdf618  00007ff6`d631bec0 MicroLogUtil!pairNode `RTTI Type Descriptor'+0x5a0
0000006d`6dcdf620  00000000`00000001
0000006d`6dcdf628  0000006d`6dcdf680
0000006d`6dcdf630  00000000`00000000
0000006d`6dcdf638  00007ff6`d6291115 MicroLogUtil!ProcessFile+0x85
0000006d`6dcdf640  0000006d`6dcdf6c0
0000006d`6dcdf648  0000006d`6dd827c8
0000006d`6dcdf650  00000000`00000000
0000006d`6dcdf658  00007fff`bd265a63 ntdll!RtlEncodePointer+0x27
0000006d`6dcdf660  00000000`00000002
0000006d`6dcdf668  00007ff6`d6307020 MicroLogUtil!`string'
0000006d`6dcdf670  0000006d`00000000
0000006d`6dcdf678  00000000`00000000
0000006d`6dcdf680  00000000`00000000
0000006d`6dcdf688  00007ff6`d62a5414 MicroLogUtil!__crtSetUnhandledExceptionFilter+0x14
0:000> dps
0000006d`6dcdf690  00000000`00000000
0000006d`6dcdf698  00000000`00000000
0000006d`6dcdf6a0  00000000`00000000
0000006d`6dcdf6a8  00000000`00000000
0000006d`6dcdf6b0  00007ff6`d62a7810 MicroLogUtil!_RTC_Terminate
0000006d`6dcdf6b8  00007ff6`d62a5f50 MicroLogUtil!__CxxSetUnhandledExceptionFilter+0x10
0000006d`6dcdf6c0  00000000`00000000
0000006d`6dcdf6c8  00000000`00000000
0000006d`6dcdf6d0  00000000`00000000
0000006d`6dcdf6d8  00000000`00000000
0000006d`6dcdf6e0  00000000`00000000
0000006d`6dcdf6e8  00000000`00000000
0000006d`6dcdf6f0  00000000`00000000
0000006d`6dcdf6f8  00000000`00000000
0000006d`6dcdf700  00000000`00000000
0000006d`6dcdf708  00000000`00000000
0:000> * Hmmm, that looks a little more promising...
0:000> * Looking for offsets that make sense... is it the VirtualQuery+0x28???
0:000> kn = 0000006d`6dcdf348
 # Child-SP          RetAddr           Call Site
00 0000006d`6dcdf348 00007fff`ba9738d8 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf350 00007fff`ba983552 KERNELBASE!VirtualQuery+0x28
02 0000006d`6dcdf390 00007fff`ba983476 KERNELBASE!SetUnhandledExceptionFilter+0x24a
03 0000006d`6dcdf420 00007ff6`d62a5414 KERNELBASE!SetUnhandledExceptionFilter+0x16e
04 0000006d`6dcdf690 00007ff6`d62a5f50 MicroLogUtil!__crtSetUnhandledExceptionFilter+0x14
05 0000006d`6dcdf6c0 00000000`00000000 MicroLogUtil!__CxxSetUnhandledExceptionFilter+0x10
0:000> * Hmm, doesn't look like it.  How about  KERNELBASE!SetUnhandledExceptionFilter+0x24a?
0:000> kn = 0000006d`6dcdf388
 # Child-SP          RetAddr           Call Site
00 0000006d`6dcdf388 00007fff`ba983552 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf390 00007fff`ba983476 KERNELBASE!SetUnhandledExceptionFilter+0x24a
02 0000006d`6dcdf420 00007ff6`d62a5414 KERNELBASE!SetUnhandledExceptionFilter+0x16e
03 0000006d`6dcdf690 00007ff6`d62a5f50 MicroLogUtil!__crtSetUnhandledExceptionFilter+0x14
04 0000006d`6dcdf6c0 00000000`00000000 MicroLogUtil!__CxxSetUnhandledExceptionFilter+0x10
0:000> * Nope.  Okay... skipping ahead to the correct one:  MicroLogUtil!_lock+0x50
0:000> kn = 0000006d`6dcdf438
 # Child-SP          RetAddr           Call Site
00 0000006d`6dcdf438 00007ff6`d6295a90 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf440 00007ff6`d6299232 MicroLogUtil!_lock+0x50
02 0000006d`6dcdf470 00007ff6`d6299ab9 MicroLogUtil!_heap_alloc_dbg_impl+0x32
03 0000006d`6dcdf4e0 00007ff6`d6299a49 MicroLogUtil!_nh_malloc_dbg_impl+0x39
04 0000006d`6dcdf530 00007ff6`d6294b7a MicroLogUtil!_nh_malloc_dbg+0x49
05 0000006d`6dcdf580 00007ff6`d62915e6 MicroLogUtil!malloc+0x2a
06 0000006d`6dcdf5d0 00007ff6`d62919e8 MicroLogUtil!AstNodeCreate+0x26
07 0000006d`6dcdf610 00007ff6`d6291115 MicroLogUtil!AstStateInit+0x48
08 0000006d`6dcdf640 00007ff6`d6291473 MicroLogUtil!ProcessFile+0x85
09 0000006d`6dcdf730 00007ff6`d629549c MicroLogUtil!wmain+0x33
0a 0000006d`6dcdf770 00007ff6`d62955de MicroLogUtil!__tmainCRTStartup+0xec
0b 0000006d`6dcdf7c0 00007fff`bcc916ad MicroLogUtil!wmainCRTStartup+0xe
0c 0000006d`6dcdf7f0 00007fff`bd2734a5 KERNEL32!BaseThreadInitThunk+0xd
0d 0000006d`6dcdf820 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
0:000> * Hey yo!  That's looking like a correct stack!