The Issue
I’ve found that sometimes controller command messages will time out even though they have already completed successfully.
Here’s a log example:
SendZwMessage: There are 0 messages in the pending queue ((nil)(0) (nil)(0) (nil)(0)) and a controller function is NOT executing 01:00:29.923 Write Msg D T 7641 N 9 ZwRequestNodeNeighborUpdate starting [with itv 20000 and otv 20000] 01:00:29.923 Write Msg X T - N - Controller Command fn: 0x48 node 9 01:00:29.923 Write Msg M T 7641 N 9 Write: 0x01 0x06 0x00 0x48 0x09 0xbb 0x5c 0x5f 01:00:29.924 Serial In M T - N - Read (controller->buffer): 0x06 01:00:29.925 Read Msg X T - N - Created message 0x95641e8 in HandleIncomingAck 01:00:29.925 Read Msg X T - N - Deleting message 0x95641e8 in SetMessageCompletionStatus 01:00:29.934 Serial In M T - N - Read (controller->buffer): 0x01 0x05 0x00 0x48 0xbb 0x21 0x28 01:00:29.935 Read Msg M T - N - Write: 0x06 01:00:29.935 Read Msg X T - N - Created message 0x95641e8 in HandleIncomingFrame 01:00:29.935 ProcessMsg R T - N 9 Node neighbor update has started. 01:00:29.935 ProcessMsg P T 7641 N 9 ZwRequestNodeNeighborUpdate is OuterCompleteSuccess 01:00:29.935 ProcessMsg X T - N - Deleting message 0xb5e6d570 in RemoveMessageFromPendingQueue2 01:00:29.936 ProcessMsg X T - N - SMQE 2 messages in the send queues, 0 pending, 1 controller 01:00:29.936 ProcessMsg X T - N - Deleting message 0x95641e8 in ProcessMessagesThread 01:00:35.505 Serial In M T - N - Read (controller->buffer): 0x01 0x05 0x00 0x48 0xbb 0x22 0x2b 01:00:35.505 Read Msg M T - N - Write: 0x06 01:00:35.505 Read Msg X T - N - Created message 0x95641e8 in HandleIncomingFrame 01:00:35.505 ProcessMsg X T - N - ZwController::HandleZwRequestNodeNeighborUpdate entered with NULL _pendingMsg. m_controllerCommandNode is 9. 01:00:35.505 ProcessMsg R T - N 9 Node neighbor update completed successfully. 01:00:35.506 ProcessMsg X T - N - SMQE 2 messages in the send queues, 0 pending, 0 controller 01:00:35.506 ProcessMsg X T - N - Deleting message 0x95641e8 in ProcessMessagesThread 01:00:49.923 Write Msg S T 7641 N 9 WriteThread Transaction timed out (after 20000ms) before achieving minimum threshold 2015-11-16 01:00:29.923 01:00:49.923 Write Msg P T 7641 N 9 $㵡 is OuterCompleteTimedOut
The problem appears to be that some controller commands are moved to the “dead node” message list and are mangled (appended to) when they are moved off this list.
Troubleshooting