Optimisations to the pdu queues
When making many concurrent requests (as is likely in any performance
criticial application), the use of SLIST_REMOVE and SLIST_ADD_END are
a severe bottleneck because of their linear search.
I considered using a double-linked list but it was unnecessary to
allocate the additional memory for each list entry.
Instead, continue to use a single-linked list but retain:
* a pointer to the end of the list; and
* a pointer to the previous entry during a linear search.
The former would makes append operations O(1) time, and the latter
does the same for removal. We can do this because removal only happens
within the linear search, and there is no random access to the queue.