Wednesday, December 7, 2016
Every Thread Starts with an IRET
What's an IRET? It's the Interrupt RETurn instruction. It tells the CPU that it is returning from an interrupt and, as such, needs to pop some additional state off the stack. The funny thing is that part of this state is the privilege level (ring) to resume in, carried in the low two bits of the saved CS selector. You can transition to numerically higher (and thus less privileged) rings using IRET. It's also one of only a few instructions that can perform this transition; a far RET and the fast system-call returns (SYSEXIT, SYSRET) are the others.
So... something even cooler (at least to me). Every single time a thread is reactivated from a context switch it is actually resuming via an IRET instruction. The symmetry of this made me happy. If a user mode thread issues a system call (Sleep, WaitForSingleObject, ReadFile, WriteFile, etc.) that's an interrupt! If a thread runs out of quantum, it will get kicked out by a timer interrupt.
The tricky part is getting the stack set up in such a way that the IRET pops off all the register context just how you need it to. Luckily, this is actually pretty easy (the instruction which triggered the interrupt did that for us). It's quite a bit harder to actually start a thread, because all those values need to be properly initialized by the kernel code.
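To make the "values need to be properly initialized" part concrete, here is a minimal sketch of the five values a 64-bit IRET (IRETQ) pops, packed in the order they sit in memory. It assumes x86-64 long mode; the selector values, the entry point, and the stack address are hypothetical placeholders, since the real ones depend on how the kernel set up its GDT and address space.

```python
import struct

# Hypothetical GDT selectors, for illustration only.
USER_CODE_SELECTOR = 0x18
USER_DATA_SELECTOR = 0x20

RFLAGS_RESERVED = 1 << 1  # bit 1 of RFLAGS always reads as 1
RFLAGS_IF = 1 << 9        # interrupt-enable flag


def build_iret_frame(entry_point, user_stack_top, ring=3):
    """Pack the frame IRETQ pops, lowest address first: RIP, CS, RFLAGS, RSP, SS."""
    cs = USER_CODE_SELECTOR | ring  # low two bits select the ring to resume in
    ss = USER_DATA_SELECTOR | ring
    rflags = RFLAGS_RESERVED | RFLAGS_IF
    return struct.pack("<5Q", entry_point, cs, rflags, user_stack_top, ss)


def requested_ring(frame):
    """Read back the ring encoded in the saved CS selector."""
    _rip, cs, _rflags, _rsp, _ss = struct.unpack("<5Q", frame)
    return cs & 0b11


# Hypothetical entry point and stack top for a brand new user mode thread.
frame = build_iret_frame(entry_point=0x401000, user_stack_top=0x7FFFF000)
```

A kernel would write these five quadwords onto the new thread's kernel stack and then execute IRET; this sketch only shows the layout, not the general-purpose registers a real context switch also has to restore.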
Because I know everyone is curious, here's some additional resources:
Tuesday, April 28, 2015
Tabs! No, SPACES!! NO!!! TABS!!!!11111one..dot..
So, where are the major arguments in software engineering? Here are a few I've encountered and how I feel about them. (I'm not a journalist; I can offer up my opinion. Most journalists seem to offer up their opinion as well these days so whatever.) Of course, I try to temper my own opinions when discussing this with my mentee.
Tabs! No, SPACES! No, TABS!
This isn't as prevalent as it used to be. Other than at my university over 10 years ago, I've only encountered it once in actual industry work. An engineer absolutely insisted on using tabs because one tab character took up less space on disk than four space characters. I prefer to be able to print code, open it in ANY text editor, or do various other things that pretty much require uniform spacing -- which means spaces rather than tabs. The engineer who argued vehemently for tabs was also the kind of guy you'd probably expect to find hand-tuning assembly to optimize the E_UNIVERSE_HEAT_DEATH error handler mentioned below. In my opinion, he's one of those engineers you wish would just stay in his corner mumbling about punch cards, FORTRAN, and the origin of the term "core dump."
Crystal palace coding
The idea that your code (and everyone else's) must be absolutely perfect. If there's a trailing space at the end of a line it's not a NIT [1], it's a giant catastrophe and "people could literally DIE if you don't fix this space at the end of the line." There's also the issue of where braces go, how case statements are formatted, CamelCase vs. under_scores and other naming conventions, and numerous other horrible things that are completely swallowed by the lexical parsing stage of the compiler. Don't get me wrong though, there are good reasons to have coding standards and the dev team should stick to them. However, since the end customer doesn't care about the coding standards at all, there are probably better areas to invest in which will surprise and delight the customer. The best part about these coding standards is that once you switch teams or companies there will be an entirely new set of them and arguments why the new ones are the best...
E_UNIVERSE_HEAT_DEATH
Handling every single possible error case no matter how ridiculous. Sometimes I'll hear someone ask how code will handle something completely insane. My favorite is out-of-memory issues. "You should throw an exception if this allocation fails." -- Really? We should try to allocate an exception object after an allocation failed? How about if we just crash instead? Is it really all that bad to crash during a catastrophic error? Is someone going to catch that OOM exception and do something insanely clever with it so we can continue proper execution again? The much more likely case is the error will bubble up a few stack frames and then someone will spend hours binary-search debugging the originator of the error code. That's an unbelievable waste of time just to adhere to someone's (I'm looking at you Bjarne Stroustrup!) ideal method of error handling. Sometimes I want to ask in response, "How do we detect if someone jams their penis into the CD-ROM tray? Should we define E_PENIS_NOT_EXPECTED and provide an appropriate message file to display helpful error text to the user? Can we DCR the CD-ROM firmware to detect illegal penis insertions?"
In summary...
I'll probably come back to this post and update it with new insights. It'll be fun to see how my views change over the years.
[1] NIT is short for "nitpick" in a code review: something that certainly doesn't need to be changed unless you're churning the code again. Basically, "I prefer it this way but I don't really care."
Monday, March 16, 2015
Amazon: Why not outsource?
From: Jon
To: XXXXXX@amazon.com
Subject: RE: Amazon's XXXXXXXXXXXXXXXXXXXX team is hiring!
Date: Mon, 16 Mar 2015 21:40:49 -0700
Hi XXXXXX,
[TL;DR: Perhaps Amazon should outsource their engineering work.]
Thanks for contacting me. I think it's wise to always be on the lookout for new opportunities. However, I think Amazon has some great challenges with respect to staffing. Here are a few reasons why I won't consider a position there at this time:
First and foremost, I LOVE my current job and the company where I work. It would take quite a bit to pull me away. Not just money, but something interesting, challenging, and fun. I'm challenged daily, I get to work on things I genuinely believe help customers daily, and I'm passionate about what I do.
Amazon doesn't have paid parental leave. At least it didn't when I asked ~12 months ago. This speaks a tremendous amount about the culture of Amazon. An employee told me most managers are willing to "let you work from home for a few weeks." The fact that paid parental leave isn't an official benefit at Amazon gives me the impression that Amazon doesn't care about or acknowledge that I may actually have a life outside of work. It's also pretty much standard at the top end of the industry to offer at least 4 weeks... If it's not a company wide policy, then the company wide policy is that it doesn't exist. Policy dictates culture; much the same way it dictates results. I've heard that Amazon pays so well these other benefits aren't needed. I should take my larger salary and apply it to the benefits I want. I feel the culture this policy dictates is one much more suited to an outsourcing contract than a long-term employee/employer relationship.
Amazon's attrition is insane. I remember reading an article which reported the average employee tenure at Amazon is 12 months. This doesn't bode well for anyone at the company. For new hires, it indicates they might as well start their next job search the first day they are hired. For veterans it means they will constantly be ramping up new hires and imparting all that tribal knowledge which inevitably grows up around a code base. Once the veterans are lost all that tribal knowledge may very well be lost. Nobody will know about the easiest way to plug in new feature ABC into the existing architecture or fix bug XYZ. So now even the company doesn't make out well in the end because the new hires are grafting features or bug fixes onto a code base with no guidance. I don't envy anyone who must maintain the patchwork disaster of a system that environment eventually fosters. The only way this isn't an issue is if the work you're doing is so thoroughly uninteresting that no real engineering is required and cookie cutter templates apply to all the work. In which case Amazon should outsource.
When I asked a manager what the deal was with the attrition I was told that "Amazon hires such good engineers that other companies constantly steal them away." Amazon wants the very best in the industry but the compensation and/or work environment is such that it sends them fleeing into the arms of another? I understand Amazon pays quite well, so it must be the work environment. This statement is either part of an HR script or someone has their head stuck somewhere it doesn't belong -- hopefully in the sand since the alternative is most unpleasant.
I ran a quick search of my LinkedIn connections for those having worked at Amazon or currently working there. Ten of my contacts have worked there and only one is still working there. I'll be generous and say that's a 10% success rate. Or perhaps I just have sub-par LinkedIn connections -- they are after all, connected to me...
Amazon's leadership. The one and only redeeming thing about Amazon is the focus on customers. Amazon loves their customers. It's clear in the service provided. It's clear in the leadership principles published on the careers page. It's even clear in the story I've heard about those "banged" emails Mr. Bezos will forward when a customer complains. However, if Amazon has a "1 out of 11 found this career acceptable" review, where's the banged email for that? How vocally self critical is the senior leadership team of their atrocious employee retention? Or is that not really what the senior leadership team cares about? Do they actually want a workforce that churns every 12 months? Do they really care little for the work-life balance? If so, why not outsource the work? Amazon doesn't seem to want long-term employees. If cogs in the wheel are good enough why not just outsource the work?
Thanks,
-- Jon
--------------------------------------------------------------------------
From: XXXXXXX@amazon.com
To: Jon
Subject: Amazon's XXXXXXXXXXXXXXXXXX team is hiring!
Date: Wed, 11 Mar 2015 22:19:18 +0000
Hi Jon-
I came across your resume today and was impressed with your experience and your education. I am looking for senior talent such as yourself to staff a new initiative for Amazon’s XXXXXXXXXXXXXX team. The position is located in Seattle. If you want to be where decisions are made and solve problems for customers I think this is a challenge you would really enjoy. Amazon's culture has been a great fit for me and one I think you would like. I would love to talk with you further. Can you suggest a time to connect?
XXXX XXXXXX | Technical Sourcer | Amazon
E: XXXXXX@amazon.com
Phone: (206) XXX-XXXX
Work hard. Have fun. Make history.
Wednesday, February 11, 2015
I'd rather...
For example:
a++;
This should not throw an out of memory exception. Before opening the whitepaper I wrote: I'd rather fashion a knife out of a salt block with an open wound and then stab myself repeatedly in the eye until brain matter leaks out than read up on how C++ can finally be made sane if we only follow these 40 simple rules.
Well, the joke's on me because when I finally opened the whitepaper it was 72 pages long.
Also, I'm in pretty good company: http://harmful.cat-v.org/software/c++/
Saturday, September 6, 2014
Side projects... what next?
Lately I've been having a hard time deciding which side project to work on at home. Of course, having a seven month old has severely limited how much time I have for side projects, but I find it's still important to work out some "work-suppressed" creativity. A coworker from my IGT days has been wanting to work on a game for a few weeks now. He's been working with Unity and probably wants to work out some of his own work-suppressed creativity. This guy is an awesome 3D artist / modeler. I really enjoyed working with him at IGT. We worked together on the last large project I did at IGT before joining Microsoft and I had a lot of fun. He taught me about skeletal animation and we worked together to extend an existing toolchain to support it. That was some of the most fun I've had in a job.
However, I'm also becoming extremely interested in operating systems. I've taken my "OS" to the point where I really need to start making decisions about the early boot environment and handoff to the kernel. Lately, I've been working on a FAT32 format utility specific to my OS. I need a custom format utility so I can add my own boot sector code and lay out the supplemental sectors accordingly. FAT32 is surprisingly simple, but also not all that interesting so the project has stalled here for a bit. Perhaps I'll just skip this and use the same 10MB "chunk" for booting as I have been. The problem is I want to start adding executable image and loader support, along with other things, so I figure I might as well have a real filesystem to work against.
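For a sense of what "add my own boot sector code" involves, here is a rough sketch of a FAT32 boot sector: a 90-byte header (the BIOS parameter block), up to 420 bytes of boot code, and the 0x55 0xAA signature. This is not the format utility described above. The function name, the "MYOS" OEM string, and every geometry value are placeholders I made up, and a real formatter must also compute the FAT size and write the FSInfo sector, the backup boot sector, and the FATs themselves.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_MAX = 420  # 512 - 90-byte header - 2-byte signature


def build_fat32_boot_sector(total_sectors, fat_size_sectors, boot_code=b""):
    """Return one 512-byte FAT32 boot sector with caller-supplied boot code."""
    if len(boot_code) > BOOT_CODE_MAX:
        raise ValueError("boot code does not fit in the boot sector")
    header = struct.pack(
        "<3s8sHBHBHHBHHHII" "IHHIHH12s" "BBBI11s8s",
        b"\xEB\x58\x90",   # jmp short to offset 0x5A (start of boot code), nop
        b"MYOS    ",       # OEM name (hypothetical)
        SECTOR_SIZE,       # bytes per sector
        8,                 # sectors per cluster (placeholder)
        32,                # reserved sectors before the first FAT
        2,                 # number of FATs
        0,                 # root entry count, always 0 on FAT32
        0,                 # 16-bit total sectors, 0 means use the 32-bit field
        0xF8,              # media descriptor: fixed disk
        0,                 # 16-bit FAT size, always 0 on FAT32
        63,                # sectors per track (placeholder geometry)
        255,               # number of heads (placeholder geometry)
        0,                 # hidden sectors
        total_sectors,     # 32-bit total sectors
        fat_size_sectors,  # 32-bit FAT size in sectors
        0,                 # extended flags
        0,                 # filesystem version
        2,                 # first cluster of the root directory
        1,                 # FSInfo sector number
        6,                 # backup boot sector number
        bytes(12),         # reserved
        0x80,              # BIOS drive number
        0,                 # reserved
        0x29,              # extended boot signature
        0x12345678,        # volume serial number (placeholder)
        b"NO NAME    ",    # volume label, 11 bytes
        b"FAT32   ",       # filesystem type string, 8 bytes
    )
    return header + boot_code.ljust(BOOT_CODE_MAX, b"\x00") + b"\x55\xAA"


# Placeholder sizes: a 512 MiB volume with a 1024-sector FAT.
sector = build_fat32_boot_sector(total_sectors=1048576, fat_size_sectors=1024)
```

The custom part is just the boot_code argument: the loader stub goes at offset 0x5A, right where the jump at the top of the sector lands.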
There are a few other smaller projects I've been thinking about as well. One is the diff utility based on a "rolling hash" function. I had a beta of this working in C# with WPF rendering and it worked well. My next step is to turn it into a C library / DLL which I can use in multiple other utilities I'd like to create. I've also looked at making a code editor a few times. I'll usually get hung up on Unicode support because it's one of those things you really kind of need to plan for up front. While this may not have resulted in much of a code editor, I certainly learned a bunch about Unicode and also thought up the concept of an index tree. I've also thought about starting up work on my fractal terrain generation and Perlin noise algorithms again. I was using XNA and there was some limitation on the type of integer calculations I could perform efficiently in the shader. I can't remember the details since it was a couple years ago, but I'd definitely like to revisit using newer DirectX or OpenGL shaders which may not have the same limitation I encountered.
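For anyone who hasn't seen a rolling hash, here is a minimal sketch of the general idea. It is a generic polynomial (Rabin-Karp style) hash, not necessarily the function the diff utility uses; BASE, MOD, and the function names are assumptions chosen for illustration. The point is that each window's hash is derived from the previous one in constant time instead of rehashing the whole window.

```python
BASE = 257
MOD = (1 << 61) - 1  # a large prime modulus keeps collisions unlikely


def direct_hash(chunk):
    """Hash a chunk from scratch; used here only to check the rolling version."""
    h = 0
    for byte in chunk:
        h = (h * BASE + byte) % MOD
    return h


def rolling_hashes(data, window):
    """Return the hash of every `window`-byte slice of `data`, left to right."""
    if window <= 0 or len(data) < window:
        return []
    high = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = direct_hash(data[:window])
    hashes = [h]
    for i in range(window, len(data)):
        h = (h - data[i - window] * high) % MOD  # drop the oldest byte
        h = (h * BASE + data[i]) % MOD           # shift and add the newest byte
        hashes.append(h)
    return hashes


text = b"the quick brown fox"
hashes = rolling_hashes(text, 4)
```

A diff tool can index one file's window hashes and then slide over the other file, treating equal hashes as candidate matches to confirm byte by byte.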
Yep... A lot of choices. I also had the beginnings of a 3D graphics rendering engine in the works -- it was using Blender as the creation framework and I had Python scripts to export into the engine formats. It was coming along. What I'd really like would be to have a job where I could work on one of these things to the point it's useful.
Friday, September 5, 2014
Microtip: Stack corruption? windbg dps to the rescue.
Everyone encounters it sooner or later. A crash with a corrupted callstack. You know what the stack pointer is, but it is clearly not unwinding properly. With a little elbow grease I've found the WinDbg command "dps" to be extremely helpful in this situation.
With this, you're telling the debugger to dump data in pointer-sized chunks and to try to match each value against loaded symbols. You'll quickly start seeing *some* sort of callstack in this display. Now the trick is to sort through the various locations to find the actual callstack. WinDbg can help with this as well: you can tell it where the stack actually starts using the kn = [base address] command to verify whether what you think is a good stack really is one. The trick is to look at symbol offsets. Pretty much all real function return addresses will have a non-zero offset, because a return address points just past a call instruction inside the calling function rather than at the function's first byte. Usually, there are a small enough number of these that you can just brute force it -- like you see me doing (although I skipped a few for brevity). If there are a ton of these potential addresses, it may require additional detective work. Also, one important gotcha -- there may appear to be more than one valid stack, so at this point you need to read the source code or disassembly and determine what makes the most sense. Knowing the ABI of the platform you're working on is invaluable here.
This is a little tedious, but the worst bugs only rarely repro and can hold up a product release...
0:000> kn
# Child-SP RetAddr Call Site
00 0000006d`6dcdf310 0000006d`6dd80000 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf318 6f6e6d6c`402c0062 0x0000006d`6dd80000
02 0000006d`6dcdf320 00000000`00000470 0x6f6e6d6c`402c0062
03 0000006d`6dcdf328 00000000`00000480 0x470
04 0000006d`6dcdf330 00000000`00000000 0x480
0:000> * Oh noooooooo!!!!
0:000> dps 0000006d`6dcdf310
0000006d`6dcdf310 0000006d`6dd80000
0000006d`6dcdf318 6f6e6d6c`402c0062
0000006d`6dcdf320 00000000`00000470
0000006d`6dcdf328 00000000`00000480
0000006d`6dcdf330 00000000`00000000
0000006d`6dcdf338 0000006d`6dcdf428
0000006d`6dcdf340 77767574`73727170
0000006d`6dcdf348 00007fff`ba9738d8 KERNELBASE!VirtualQuery+0x28
0000006d`6dcdf350 87868584`83828180
0000006d`6dcdf358 8f9e8d9c`8b9a8988
0000006d`6dcdf360 97969594`93929190
0000006d`6dcdf368 ff9e9d9c`9b9a9998
0000006d`6dcdf370 00000000`00000030
0000006d`6dcdf378 0000006d`6dcdf3a8
0000006d`6dcdf380 0000006d`6dcdf440
0000006d`6dcdf388 00007fff`ba983552 KERNELBASE!SetUnhandledExceptionFilter+0x24a
0:000> dps
0000006d`6dcdf390 e7e6e5e4`e3e2e1e0
0000006d`6dcdf398 efeeedec`ebeae9e8
0000006d`6dcdf3a0 d7f6f5f4`f3f2f1f0
0000006d`6dcdf3a8 00000000`00000030
0000006d`6dcdf3b0 00000000`0000021a
0000006d`6dcdf3b8 00000000`00000000
0000006d`6dcdf3c0 00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf3c8 01000000`00000080
0000006d`6dcdf3d0 00000000`00095000
0000006d`6dcdf3d8 00007fff`bd24dcb7 ntdll!RtlDecodePointer+0x27
0000006d`6dcdf3e0 00000000`00000000
0000006d`6dcdf3e8 00000000`00000000
0000006d`6dcdf3f0 00000000`00000000
0000006d`6dcdf3f8 00000000`00000000
0000006d`6dcdf400 00000000`00000000
0000006d`6dcdf408 00007ff6`d62a5ec0 MicroLogUtil!__CxxUnhandledExceptionFilter
0:000> dps
0000006d`6dcdf410 000012b3`e0966000
0000006d`6dcdf418 00007fff`ba983476 KERNELBASE!SetUnhandledExceptionFilter+0x16e
0000006d`6dcdf420 00000000`00000000
0000006d`6dcdf428 00000000`959f04b3
0000006d`6dcdf430 0000006d`6dd86480
0000006d`6dcdf438 00007ff6`d6295a90 MicroLogUtil!_lock+0x50
0000006d`6dcdf440 00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf448 00000000`00095000
0000006d`6dcdf450 03810381`01000000
0000006d`6dcdf458 00000000`005a0058
0000006d`6dcdf460 0000006d`6dcdf468
0000006d`6dcdf468 00007ff6`d6299232 MicroLogUtil!_heap_alloc_dbg_impl+0x32
0000006d`6dcdf470 00000000`00000004
0000006d`6dcdf478 00007fff`bd2e1ebc ntdll!RtlZeroHeap+0x6e8
0000006d`6dcdf480 006b0073`00690064
0000006d`6dcdf488 0075006c`006f0056
0:000> dps
0000006d`6dcdf490 005c0032`0065006d
0000006d`6dcdf498 005c0070`006d0074
0000006d`6dcdf4a0 00720063`0069004d
0000006d`6dcdf4a8 0067006f`10070017
0000006d`6dcdf4b0 00000000`6dd86310
0000006d`6dcdf4b8 00007fff`bd2be5e7 ntdll!memset+0x1fa27
0000006d`6dcdf4c0 00000000`00000000
0000006d`6dcdf4c8 00000000`00000001
0000006d`6dcdf4d0 0000006d`6dd80000
0000006d`6dcdf4d8 00007ff6`d6299ab9 MicroLogUtil!_nh_malloc_dbg_impl+0x39
0000006d`6dcdf4e0 00000000`00000030
0000006d`6dcdf4e8 00000000`00000001
0000006d`6dcdf4f0 00000000`00000000
0000006d`6dcdf4f8 00007fff`00000000
0000006d`6dcdf500 0000006d`6dcdf560
0000006d`6dcdf508 00000000`00000000
0:000> dps
0000006d`6dcdf510 00000000`00000000
0000006d`6dcdf518 02100210`02100210
0000006d`6dcdf520 00000000`00000101
0000006d`6dcdf528 00007ff6`d6299a49 MicroLogUtil!_nh_malloc_dbg+0x49
0000006d`6dcdf530 00000000`00000030
0000006d`6dcdf538 00007ff6`00000000
0000006d`6dcdf540 00000000`00000001
0000006d`6dcdf548 00000000`00000000
0000006d`6dcdf550 00000000`00000000
0000006d`6dcdf558 0000006d`6dcdf560
0000006d`6dcdf560 00000000`00000000
0000006d`6dcdf568 00000000`00000000
0000006d`6dcdf570 00000000`00000000
0000006d`6dcdf578 00007ff6`d6294b7a MicroLogUtil!malloc+0x2a
0000006d`6dcdf580 00000000`00000030
0000006d`6dcdf588 0000006d`00000000
0:000> dps
0000006d`6dcdf590 0000006d`00000001
0000006d`6dcdf598 00000000`00000000
0000006d`6dcdf5a0 00000000`00000000
0000006d`6dcdf5a8 00007ff6`d6295db2 MicroLogUtil!_unlock+0x22
0000006d`6dcdf5b0 00000000`00000000
0000006d`6dcdf5b8 02100302`00000030
0000006d`6dcdf5c0 00007ff6`00000000
0000006d`6dcdf5c8 00007ff6`d62915e6 MicroLogUtil!AstNodeCreate+0x26
0000006d`6dcdf5d0 00000000`00000030
0000006d`6dcdf5d8 00007ff6`d62999b4 MicroLogUtil!_msize_dbg+0x234
0000006d`6dcdf5e0 0000006d`00000004
0000006d`6dcdf5e8 00007ff6`d62aa677 MicroLogUtil!_setmbcp_nolock+0x447
0000006d`6dcdf5f0 00000000`00000000
0000006d`6dcdf5f8 00007fff`bd265a63 ntdll!RtlEncodePointer+0x27
0000006d`6dcdf600 00000000`00000100
0000006d`6dcdf608 00007ff6`d62919e8 MicroLogUtil!AstStateInit+0x48
0:000> dps
0000006d`6dcdf610 00000000`00000000
0000006d`6dcdf618 00007ff6`d631bec0 MicroLogUtil!pairNode `RTTI Type Descriptor'+0x5a0
0000006d`6dcdf620 00000000`00000001
0000006d`6dcdf628 0000006d`6dcdf680
0000006d`6dcdf630 00000000`00000000
0000006d`6dcdf638 00007ff6`d6291115 MicroLogUtil!ProcessFile+0x85
0000006d`6dcdf640 0000006d`6dcdf6c0
0000006d`6dcdf648 0000006d`6dd827c8
0000006d`6dcdf650 00000000`00000000
0000006d`6dcdf658 00007fff`bd265a63 ntdll!RtlEncodePointer+0x27
0000006d`6dcdf660 00000000`00000002
0000006d`6dcdf668 00007ff6`d6307020 MicroLogUtil!`string'
0000006d`6dcdf670 0000006d`00000000
0000006d`6dcdf678 00000000`00000000
0000006d`6dcdf680 00000000`00000000
0000006d`6dcdf688 00007ff6`d62a5414 MicroLogUtil!__crtSetUnhandledExceptionFilter+0x14
0:000> dps
0000006d`6dcdf690 00000000`00000000
0000006d`6dcdf698 00000000`00000000
0000006d`6dcdf6a0 00000000`00000000
0000006d`6dcdf6a8 00000000`00000000
0000006d`6dcdf6b0 00007ff6`d62a7810 MicroLogUtil!_RTC_Terminate
0000006d`6dcdf6b8 00007ff6`d62a5f50 MicroLogUtil!__CxxSetUnhandledExceptionFilter+0x10
0000006d`6dcdf6c0 00000000`00000000
0000006d`6dcdf6c8 00000000`00000000
0000006d`6dcdf6d0 00000000`00000000
0000006d`6dcdf6d8 00000000`00000000
0000006d`6dcdf6e0 00000000`00000000
0000006d`6dcdf6e8 00000000`00000000
0000006d`6dcdf6f0 00000000`00000000
0000006d`6dcdf6f8 00000000`00000000
0000006d`6dcdf700 00000000`00000000
0000006d`6dcdf708 00000000`00000000
0:000> * Hmmm, that looks a little more promising...
0:000> * Looking for offsets that make sense... is it the VirtualQuery+0x28???
0:000> kn = 0000006d`6dcdf348
# Child-SP RetAddr Call Site
00 0000006d`6dcdf348 00007fff`ba9738d8 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf350 00007fff`ba983552 KERNELBASE!VirtualQuery+0x28
02 0000006d`6dcdf390 00007fff`ba983476 KERNELBASE!SetUnhandledExceptionFilter+0x24a
03 0000006d`6dcdf420 00007ff6`d62a5414 KERNELBASE!SetUnhandledExceptionFilter+0x16e
04 0000006d`6dcdf690 00007ff6`d62a5f50 MicroLogUtil!__crtSetUnhandledExceptionFilter+0x14
05 0000006d`6dcdf6c0 00000000`00000000 MicroLogUtil!__CxxSetUnhandledExceptionFilter+0x10
0:000> * Hmm, doesn't look like it. How about KERNELBASE!SetUnhandledExceptionFilter+0x24a?
0:000> kn = 0000006d`6dcdf388
# Child-SP RetAddr Call Site
00 0000006d`6dcdf388 00007fff`ba983552 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf390 00007fff`ba983476 KERNELBASE!SetUnhandledExceptionFilter+0x24a
02 0000006d`6dcdf420 00007ff6`d62a5414 KERNELBASE!SetUnhandledExceptionFilter+0x16e
03 0000006d`6dcdf690 00007ff6`d62a5f50 MicroLogUtil!__crtSetUnhandledExceptionFilter+0x14
04 0000006d`6dcdf6c0 00000000`00000000 MicroLogUtil!__CxxSetUnhandledExceptionFilter+0x10
0:000> * Nope. Okay... skipping ahead to the correct one: MicroLogUtil!_lock+0x50
0:000> kn = 0000006d`6dcdf438
# Child-SP RetAddr Call Site
00 0000006d`6dcdf438 00007ff6`d6295a90 ntdll!RtlEnterCriticalSection+0x22
01 0000006d`6dcdf440 00007ff6`d6299232 MicroLogUtil!_lock+0x50
02 0000006d`6dcdf470 00007ff6`d6299ab9 MicroLogUtil!_heap_alloc_dbg_impl+0x32
03 0000006d`6dcdf4e0 00007ff6`d6299a49 MicroLogUtil!_nh_malloc_dbg_impl+0x39
04 0000006d`6dcdf530 00007ff6`d6294b7a MicroLogUtil!_nh_malloc_dbg+0x49
05 0000006d`6dcdf580 00007ff6`d62915e6 MicroLogUtil!malloc+0x2a
06 0000006d`6dcdf5d0 00007ff6`d62919e8 MicroLogUtil!AstNodeCreate+0x26
07 0000006d`6dcdf610 00007ff6`d6291115 MicroLogUtil!AstStateInit+0x48
08 0000006d`6dcdf640 00007ff6`d6291473 MicroLogUtil!ProcessFile+0x85
09 0000006d`6dcdf730 00007ff6`d629549c MicroLogUtil!wmain+0x33
0a 0000006d`6dcdf770 00007ff6`d62955de MicroLogUtil!__tmainCRTStartup+0xec
0b 0000006d`6dcdf7c0 00007fff`bcc916ad MicroLogUtil!wmainCRTStartup+0xe
0c 0000006d`6dcdf7f0 00007fff`bd2734a5 KERNEL32!BaseThreadInitThunk+0xd
0d 0000006d`6dcdf820 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
0:000> * Hey yo! That's looking like a correct stack!
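The offset heuristic used in this session can be sketched as a small filter over saved dps output. This is only an illustration in Python, not a WinDbg feature: the function name and the regular expressions are my own assumptions, and the sample lines are copied from the session above. It keeps only the stack slots whose value resolved to a symbol with a non-zero offset.

```python
import re

# One line of dps output: stack address, stored value, optional symbol text.
DPS_LINE = re.compile(
    r"^(?P<addr>[0-9a-f]{8}`[0-9a-f]{8})\s+"
    r"(?P<value>[0-9a-f]{8}`[0-9a-f]{8})"
    r"(?:\s+(?P<symbol>.+))?$"
)
SYMBOL_OFFSET = re.compile(r"\+0x([0-9a-f]+)$")

SAMPLE = """\
0000006d`6dcdf3c0 00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf438 00007ff6`d6295a90 MicroLogUtil!_lock+0x50
0000006d`6dcdf440 00007ff6`d6290000 MicroLogUtil!__ImageBase
0000006d`6dcdf448 00000000`00095000
0000006d`6dcdf468 00007ff6`d6299232 MicroLogUtil!_heap_alloc_dbg_impl+0x32
"""


def candidate_return_addresses(dps_output):
    """Return (stack slot, symbol) pairs whose symbol has a non-zero offset."""
    candidates = []
    for line in dps_output.splitlines():
        match = DPS_LINE.match(line.strip())
        if not match or not match.group("symbol"):
            continue  # prompts, comments, or values with no symbol match
        offset = SYMBOL_OFFSET.search(match.group("symbol"))
        if offset and int(offset.group(1), 16) != 0:
            candidates.append((match.group("addr"), match.group("symbol")))
    return candidates


candidates = candidate_return_addresses(SAMPLE)
```

Each stack slot address it returns is a value to try with kn =, the same way the session does by hand. It will still produce false positives (data that happens to resolve to a symbol plus offset), so reading the source or disassembly remains the final check.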
Friday, August 29, 2014
What makes a good job?
Teammates
I've pretty much always stayed in a team well after the expiration date because of great team members. I often find myself missing peers more than any other positive aspects of a job. Because of this I feel my peers are an extremely important factor in job satisfaction.
Good teammates turn into family friends and lifelong contacts. I recently had lunch with a guy I worked with over 3 years ago. It was the highlight of my week and I'd love to work with him again. If you want to know if you're in a cohesive team you only need to count the number of times team members have done things together away from work.
Technology
If I'm honest with myself I think I have to admit something: What I'm working on seems to be much more of a deciding factor when choosing the next job than a contributor to actual happiness once there. I get caught up looking for the next shiny thing. However, there is some practicality in allowing technology to influence my next role. Usually, I try to choose a technology area that I believe will be most useful for my long term career. This is when I try to make sure I don't get stuck or pigeonholed into one area. I remember an engineer I worked with who was getting progressively bitter about always working on the same thing. He hated what he was being asked to work on but didn't seem to want to change positions. I suppose this is the most important reason to consider technology when choosing a job. You don't want to wither away as an expert in a dead-end technology.
Manager
My manager is probably the single most important influence on job satisfaction. Unfortunately, companies (and their managers) don't seem to understand this fact very often. I've worked in company cultures where the only way to grow was to enter management. This is horrible. It takes a company's most productive employees and mashes them into a completely different area of expertise -- one in which they most likely are not qualified. One great thing about Microsoft is the dual path they support. You can be a "people manager" or you can be an "individual contributor" (IC) and obtain the same compensation.
In my opinion, even worse than the employee who enters management because company culture doesn't permit IC growth is the employee who enters management for the prestige of it. These "prestige managers" seem to think being a boss is a more important role than being an IC. The reason this is the worst possible situation is these people don't understand a fundamental rule of leading: The leader works for the team.
Someone looking for prestige in a leadership role is under the impression the team works for the leader. Paradoxically, the only time the team truly will work for the leader is when the leader has won the respect and admiration of the team members. This does not happen through silly mandates, make-work meetings and tasks, managing appearances rather than results, or overbearing "guidance." It does not come from business buzzwords or MBA degrees. It comes from facilitating the team members. It comes from getting out of the way when you're a hindrance, proactively removing obstacles, and internalizing team member requirements. A bad leader will say, "I hope you're empowered to fix this." A good leader will not need to talk about empowerment because their ICs already are.
At one point I naively thought these misguided managers simply never had the benefit of an excellent manager mentor. I don't believe that's the case anymore. I've personally observed multiple ICs who had solid examples of good leadership turn into some of the worst prestige managers. They are incapable of observing what makes the solid leaders excellent. Since they are blind to these traits they are also incapable of applying the principles in their team. It's easy to spot these managers. Their teams bleed talent and wither. Team members who stick around are demoralized, have low job satisfaction, and through apathy eventually give up trying to make any positive change. A bad leader can poison a team or organization the way cancer moves through the body.
Partner teams
Much like the culture of a team, the culture of partner teams is extremely important. Just as a bad manager introduces a cancer to the team, so does a bad partner team. Partner teams are by definition peers or customers. Nobody wants to work with a jerk and nobody wants a jerk for a customer. Talented ICs will eventually leave a team forced to deal with horrible partner teams.
Work-Life balance
"I don't live to work, I work to live." I believe a balanced personal life creates many more motivated employees and directly impacts the success of a company, organization, or team. For me, when work cuts into my personal time I end up despising work. More importantly, when work-life balance shifts too far toward work it's an indicator the team, organization, or company will eventually fail. ICs are asked to work too much because of scheduling / planning issues or staffing issues.
If a team, organization, or company has poor planning the quality of the final product will eventually suffer. Quality will suffer because bugs are not caught (or are caught too late to fix and reset testing to make a release date). Quality will also eventually suffer because the best employees will eventually leave. Frequent "fire drills" point to a process or cultural issue which the team, organization, or company cannot change or is not motivated enough to change.
If a team, organization, or company has a staffing issue one must ask why. Do they suffer from high attrition? Do they not compensate properly? Do they expect too much of ICs? Can they not find qualified candidates? All of these are related to the simple fact that it's not a good place to work. Most of the issues behind the questions above are obvious, but what about the lack of qualified candidates? I believe there are quite a few extremely smart people out there looking for something new. If you can't entice quality candidates how do you expect to keep current talent?