Skip to content
Commit 02eafc19 authored by Declan Snyder's avatar Declan Snyder Committed by Johan Hedberg
Browse files

drivers: eth_nxp_enet_qos: Fix deadlock in system workqueue



There was a deadlock occurring, exposed by http server sample because of
situations like this caused by tx done work being blocked in deadlock:
1) The TX would be started by some thread and the driver TX sem would be
   taken.
2) The http server socket would get scheduled on the system workqueue to
   send something, claim the network TX interface mutex,
   and be blocked taking the semaphore.
3) The RX traffic class handler would get blocked trying to claim the
   network interface TX mutex, while trying to send an ACK in the TCP
   callback. This means the RX packets would not be processed.
4) Lots of RX unable to allocate packets errors would happen, and all RX
   would be dropped. This was the main symptom of the deadlock, which
   made it look like a memory leak but actually had nothing to do with
   the RX code nor any memory leak.
5) The TX DMA would finish and schedule the TX DMA done work onto the
   system work queue, behind the http server socket which is blocked on
   the waiting for the driver TX semaphore.
6) If the TX DMA done work would have ran, that's what gives the TX
   driver semaphore. So this is the reason for the deadlock of all these
   different threads and work items, the misqueue in the system
   workqueue.

Fix by just calling the TX DMA done code directly from the ISR, it
should be ISR safe, and really not a lot of code to execute, just
freeing some net buffers and the packet and updating the stats.
An optimization can be made later if needed, but for now,
solving the deadlock is a more urgent priority.

Signed-off-by: default avatarDeclan Snyder <declan.snyder@nxp.com>
parent c96d891f
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment