Monday, January 30, 2012


While working on a streaming engine I came across an interesting little hole in the MSDN documentation for Completion Ports. Completion Ports allow for extremely efficient throughput of data. This is accomplished by associating file handles with a "Completion Port", so that finished IO is queued to the port, and then having one or more threads wait on that port.

This is so fast because Windows can then choose which thread will complete the IO operation. Using a thread pool allows Windows to always wake the most recently run waiting thread (LIFO order). This greatly reduces TLB thrashing and other issues associated with a context switch on the CPU. When the threads aren't processing IO they are in a wait state. The completed IO itself is handed out in FIFO order.

The usual Completion Port architecture looks something like this:
  1. Create a Completion Port using CreateIoCompletionPort.
  2. Create the threads for the thread pool and call GetQueuedCompletionStatus to associate the threads with the Completion Port.
  3. Associate file HANDLEs (opened for Overlapped IO with FILE_FLAG_OVERLAPPED) with the Completion Port.
  4. Issue IO operations using ReadFile/WriteFile.
  5. Process the IO operations in the thread pool threads.

When any IO operation completes Windows will smartly choose a thread waiting on GetQueuedCompletionStatus to wake up and hand it the IO result. The call to GetQueuedCompletionStatus will return and data processing can begin. Ideally, an application would probably only have one Completion Port and perform all IO processing on this port/thread pool pair.

Everything about this is awesome, except... the documentation is really vague about how to handle ReadFile/WriteFile operations returning success (and thus not being queued). You need to make sure you call GetOverlappedResult (and probably with the bWait parameter set to FALSE) or you will start getting strange errors.

After a few of these immediate IO completions my streaming engine started hitting ReadFile failures with the error "ERROR_WORKING_SET_QUOTA". Nowhere in the documentation for Completion Ports or GetOverlappedResult does it indicate that GetOverlappedResult should be called in the Completion Port case. I suppose it's implied by the fact that you're using Overlapped IO, but an explicit note on MSDN would still be useful.

This may be obvious to some, but I wasted about an hour on this, so hopefully this post will shorten that time for someone else.