In Windows Vista, much of the network stack that ships with the OS consumes considerably more kernel stack space than it did in previous versions of the operating system.

In my experience, just indicating a UDP datagram up to NDIS can require more than 4K of available kernel stack on x86. With less than that, you risk taking a double fault and causing the system to bugcheck.

For example, here’s a portion of the stack that I ran into while debugging an unrelated problem at the Vista compatibility lab:

0: kd> k100
ChildEBP RetAddr
818e6bdc 818ad19b RtlpBreakWithStatusInstruction
818e6c2c 818adc08 KiBugCheckDebugBreak+0x1c
818e6fdc 8184845e KeBugCheck2+0x5f4
818e6fdc 81871d35 KiTrap08+0x75
9c9cb084 8186dd14 SepAccessCheck+0x1e0
9c9cb0e0 81887907 SeAccessCheck+0x1a4
9c9cb51c 8715474c SeAccessCheckFromState+0xe4
9c9cb55c 871546d6 CompareSecurityContexts+0x47
9c9cb57c 87153b1a MatchValues+0xd4
9c9cb59c 87153aa7 CheckEqualConditionEnumMatch+0x3f
9c9cb63c 87153a1b MatchConditionOverlap+0x72
9c9cb660 87153774 FilterMatchEnum+0x6c
9c9cb674 8715948b FilterMatchEnumVisible+0x28
9c9cb6ac 87159520 IndexHashFastEnum+0x4d
9c9cb6f8 87158624 IndexHashEnum+0x139
9c9cb724 87159362 FeEnumLayer+0x7a
9c9cb7ac 87159b16 KfdGetLayerActionFromEnumTemplate+0x50
9c9cb7cc 8d6af9e4 KfdCheckAndCacheAcceptBypass+0x27
9c9cb8c4 8d6afc87 CheckAcceptBypass+0x146
9c9cb9a0 8d6b185d WfpAleAuthorizeReceive+0x82
9c9cba08 8d6ad542 WfpAleConnectAcceptIndicate+0x98
9c9cba74 8d6ad432 ProcessALEForTransportPacket+0xc5
9c9cbaf0 8d6ae6b3 ProcessAleForNonTcpIn+0x6f
9c9cbd28 8d6b0df0 WfpProcessInTransportStackIndication+0x2ab
9c9cbd78 8d6b0ae0 InetInspectReceiveDatagram+0x9a
9c9cbdfc 8d6b091c UdpBeginMessageIndication+0x33
9c9cbe44 8d6aecf3 UdpDeliverDatagrams+0xce
9c9cbe90 8d6aec40 UdpReceiveDatagrams+0xab
9c9cbea0 8d6acdd4 UdpNlClientReceiveDatagrams+0x12
9c9cbecc 8d6acba4 IppDeliverListToProtocol+0x49
9c9cbeec 8d6acad3 IppProcessDeliverList+0x2a
9c9cbf40 8d6ab443 IppReceiveHeaderBatch+0x1da
9c9cbfd0 8d6ac61d IpFlcReceivePackets+0xc06
9c9cc04c 8d6abf36 FlpReceiveNonPreValidatedNetBufferListChain
9c9cc074 8727b0b0 FlReceiveNetBufferListChain+0x104
9c9cc0a8 8726d737 ndisMIndicateNetBufferListsToOpen+0xab
9c9cc0d0 8726d6ae ndisIndicateSortedNetBufferLists+0x4a
9c9cc24c 871b53c3 ndisMDispatchReceiveNetBufferLists+0x129
9c9cc268 872802c4 ndisMTopReceiveNetBufferLists+0x2c
9c9cc2b4 b0a3fb4d ndisMIndicatePacketsToNetBufferLists+0xe9

From ndisMIndicatePacketsToNetBufferLists to where the system double faulted (in my case, in the SeAccessCheck path, at SepAccessCheck in the trace above), a whopping 4656 bytes (0x1230) of kernel stack were consumed.
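
To make the arithmetic behind that number explicit, here is a minimal sketch that subtracts the two ChildEBP values taken from the trace above. The function and constant names are my own, purely for illustration; the only real data are the two addresses.

```python
# ChildEBP values copied from the stack trace above (x86, 32-bit addresses).
EBP_AT_FAULT = 0x9C9CB084  # SepAccessCheck frame, innermost frame on the thread stack
EBP_AT_ENTRY = 0x9C9CC2B4  # ndisMIndicatePacketsToNetBufferLists frame


def stack_consumed(inner_ebp: int, outer_ebp: int) -> int:
    """Bytes of stack between two frames.

    The x86 stack grows downward, so the outer (older) frame has the
    higher address and the difference is positive.
    """
    return outer_ebp - inner_ebp


used = stack_consumed(EBP_AT_FAULT, EBP_AT_ENTRY)
print(f"{used} bytes (0x{used:x})")  # prints "4656 bytes (0x1230)"
```

Note that this only measures the distance between frame pointers, so it slightly understates the total: whatever SepAccessCheck itself had pushed below its own frame pointer is not counted.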

So, now is the time to slim down the stack usage of your NDIS-related drivers, or you might be in for some unpleasant surprises when your drivers are used in conjunction with multiple third-party intermediate (IM) drivers or the like. Even better, you might investigate switching away from IM drivers to the new filtering architecture. You should also be especially wary of any code that loops a packet back so that it might re-enter tcpip.sys in a receive calling context (or in any other context where stack space may be limited), as this can prove an unexpectedly expensive operation on Vista (and potentially beyond).

Oh, and a tip for finding stack-hogging functions when you are chasing a stack overflow: use the ‘f’ flag with the ‘k’ command in WinDbg (the ‘n’ flag in the example just adds frame numbers). For example:

0: kd> knf
 #   Memory  ChildEBP RetAddr
00           818e6bdc 818ad19b RtlpBreakWithStatusInstruction
01        50 818e6c2c 818adc08 KiBugCheckDebugBreak+0x1c
02       3b0 818e6fdc 8184845e KeBugCheck2+0x5f4
03         0 818e6fdc 81871d35 KiTrap08+0x75

This has the debugger compute the stack usage (arguments + locals) at each call frame for you, shown in hexadecimal in the Memory column as the distance between adjacent frames, saving you a bit of work with the calculator.
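
If all you have is a saved plain ‘k’ listing (without the Memory column), the same per-frame numbers can be recovered after the fact. The following is a rough sketch, assuming the x86 `ChildEBP RetAddr Symbol` line format shown earlier; the helper names (`parse_frames`, `frame_sizes`) are hypothetical, and the sample text is a five-frame excerpt from the trace above.

```python
import re

# Matches x86 'k' lines such as "9c9cb084 8186dd14 SepAccessCheck+0x1e0".
# Assumption: plain 'k' output with no frame-number or Memory columns.
FRAME_RE = re.compile(r"^\s*([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})\s+(\S+)")


def parse_frames(k_output: str):
    """Return a list of (child_ebp, symbol) tuples, innermost frame first."""
    frames = []
    for line in k_output.splitlines():
        m = FRAME_RE.match(line)
        if m:  # header and prompt lines do not match and are skipped
            frames.append((int(m.group(1), 16), m.group(3)))
    return frames


def frame_sizes(frames):
    """Pair each frame with its distance from the frame listed before it.

    This mirrors how the Memory column lines up in the 'kf' output above:
    the value printed on a line is that line's ChildEBP minus the previous
    line's ChildEBP.
    """
    return [
        (ebp - prev_ebp, symbol)
        for (prev_ebp, _), (ebp, symbol) in zip(frames, frames[1:])
    ]


SAMPLE = """\
ChildEBP RetAddr
9c9cb084 8186dd14 SepAccessCheck+0x1e0
9c9cb0e0 81887907 SeAccessCheck+0x1a4
9c9cb51c 8715474c SeAccessCheckFromState+0xe4
9c9cb55c 871546d6 CompareSecurityContexts+0x47
9c9cb57c 87153b1a MatchValues+0xd4
"""

sizes = frame_sizes(parse_frames(SAMPLE))

# List the largest frames first to spot the stack hogs.
for size, symbol in sorted(sizes, reverse=True):
    print(f"{size:#7x}  {symbol}")
```

Be careful to feed it frames from a single stack only. In the double fault trace above, the KiTrap08 and bugcheck frames live on a different stack than the thread that overflowed, so a difference computed across that boundary is meaningless.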