drivers: usb_dc_rpi_pico: fix stability issue related to control transfers
Description of the bad behaviour before this change:
The arming of the control EP0_OUT endpoint was not kept under
control. It could happen that the EP0_OUT endpoint was left
armed, after the completion of a complete control transfer.
It is clear that the intention was to NOT keep EP0_OUT constantly
armed while idle, because usb_dc_ep_enable() doesn't arm it,
and the intention was for when usb_dc_ep_read() is called to
collect the Setup-Stage 8-byte data, that is when EP0_OUT is armed,
and before this call is performed, the host will keep getting NAKs
for the Data-Stage of the to_device control transfer.
This happens correctly on the first to_device control transfer with
wLength > 0. However, because usb_dc_ep_read_continue() indiscriminately
re-arms all OUT endpoints, in the case of to_device control transfers
with wLength > 0, on the Data-Stage, the endpoint is also re-armed,
which is wrong, because then the endpoint will be left armed after
the control transfer is over.
In this case when a new to_device control transfer starts, the
Data-Stage will be accepted on the first try. This would still
have worked without a failure if the Setup-Stage would have been
processed immediately, but because we process everything in a work
queue at a later time, when the Setup-Stage associated 8-byte data
buffer is read both the Setup-Stage and Data-Stage have arrived.
At the end of handling the Setup-Stage we try to re-arm the EP0_OUT,
which already contains data, thereby corrupting the received length
portion of the buf_ctl register. (Obviously other fields are changed
too, but the length field is the one that first causes chaos, cause
it's written to the maximum, which is 64.) The above mentioned Data-Stage
already has a message in its workqueue for it to be processed, but
it is picked up only after the length field has been corrupted.
Because of this usb_dc_ep_read() thinks there is more data in the buffer
than there really is, and everything becomes de-synchronized, with
later reads accessing uninitialized parts of the buffer.
This sounds like a fundamental failure, that should make it impossible
to operate USB, however the reason this behaviour doesn't make it
impossible to enumerate the device is that this only affects
to_device control transfers with wLength > 0, and during enumeration
there are not many of those happening.
When enumerating a HID keyboard, there is only _one_ of those
happening, and it is the initial setting of the lock light led status.
And that first one succeeds because it's the first one. (However, later
lock light setting control transfers can cause problems, which is how
this problem was encountered.)
The solution in this commit is to keep better control over when EP0_OUT
is armed. This forces the Data-Stage to arrive later (the host will keep
re-trying), and that way the corruption of the buffer control register
is avoided.
Summary of the changes:
- Rework the logic around deciding wether to re-arm the out endpoint
after a read. For non-0 endpoint the previous behaviour is kept,
however for EP0 it is only re-armed if more OUT transactions are
expected for that SETUP transfer (be it data-stage or status-stage)
- Force un-arm the EP0_OUT endpoint in case a stall condition is observed.
- When a setup transfer is received check if EP0_OUT is already armed.
If armed then log a warning message, and force-disarm it.
- When a setup req interrupt fires, don't immediately force the next
read to get it, instead, it will be read only after a setup message
is extracted from the message queue.
- When a setup packet is received abort any unfinished previous control
transfers:
- cancel any data buffers given to the EP0_IN endpoint
- drop any new ep0_in writes that are attempted before this newest
setup packet's associated message is extracted from the message
queue.
- In the ISR, check buffer interrupts before setup req interrupts.
This is to make sure that the final 0-length status message from the
previous setup packet is consumed before the new setup packet.
(this is the only case now when both interrupts could be seen as
having fired by the time the interrupt handler routine executes.
Signed-off-by:
Purdea Andrei <andrei@purdea.ro>
Co-authored-by: Johann Fischer
Loading
Please sign in to comment