Message ID | 9ce226b4-90c6-94c4-a5ad-bd623409bc51@ideasonboard.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Explicit status phase for DWC3 | expand |
On Tue, Jan 24, 2023, Dan Scally wrote: > Hi Thinh > > > I'm trying to update the DWC3 driver to allow the status phase of a > transaction to be explicit; meaning the gadget driver has the choice to > either Ack or Stall _after_ the data phase so that the contents of the data > phase can be validated. I thought that I should be able to achieve this by > preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() > (relying on an "explicit_status" flag added to the usb_request to decide > whether or not to do so) and then calling it manually later once the data > phase was validated by the gadget driver (or indeed userspace). A very > barebones version of my attempt to do that looks like this: > We shouldn't do this. At the protocol level, there must be better ways to do handshake than relying on protocol STALL _after_ the data stage. Note that not all controllers support this. Thanks, Thinh
Hi Thinh On 26/01/2023 00:20, Thinh Nguyen wrote: > On Tue, Jan 24, 2023, Dan Scally wrote: >> Hi Thinh >> >> >> I'm trying to update the DWC3 driver to allow the status phase of a >> transaction to be explicit; meaning the gadget driver has the choice to >> either Ack or Stall _after_ the data phase so that the contents of the data >> phase can be validated. I thought that I should be able to achieve this by >> preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() >> (relying on an "explicit_status" flag added to the usb_request to decide >> whether or not to do so) and then calling it manually later once the data >> phase was validated by the gadget driver (or indeed userspace). A very >> barebones version of my attempt to do that looks like this: >> > We shouldn't do this. At the protocol level, there must be better ways > to do handshake than relying on protocol STALL _after_ the data stage. > Note that not all controllers support this. Maybe I'm misunderstanding, but isn't this how the USB spec expects it to work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for the status stage in a control write it says "The function responds with either a handshake or a zero-length data packet to indicate its current status", and the handshake can be either STALL or NAK. If we can't do this, how else can we indicate to the host that the data sent during a control out transfer is in some way invalid? Thanks Dan > > Thanks, > Thinh
On Thu, Jan 26, 2023, Dan Scally wrote: > Hi Thinh > > On 26/01/2023 00:20, Thinh Nguyen wrote: > > On Tue, Jan 24, 2023, Dan Scally wrote: > > > Hi Thinh > > > > > > > > > I'm trying to update the DWC3 driver to allow the status phase of a > > > transaction to be explicit; meaning the gadget driver has the choice to > > > either Ack or Stall _after_ the data phase so that the contents of the data > > > phase can be validated. I thought that I should be able to achieve this by > > > preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() > > > (relying on an "explicit_status" flag added to the usb_request to decide > > > whether or not to do so) and then calling it manually later once the data > > > phase was validated by the gadget driver (or indeed userspace). A very > > > barebones version of my attempt to do that looks like this: > > > > > We shouldn't do this. At the protocol level, there must be better ways > > to do handshake than relying on protocol STALL _after_ the data stage. > > Note that not all controllers support this. > > > Maybe I'm misunderstanding, but isn't this how the USB spec expects it to > work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for > the status stage in a control write it says "The function responds with > either a handshake or a zero-length data packet to indicate its current > status", and the handshake can be either STALL or NAK. If we can't do this, > how else can we indicate to the host that the data sent during a control out > transfer is in some way invalid? > My concern is from the documentation note[*] added from this commit: 579c2b46f74 ("USB Gadget: documentation update") It should be fine with dwc3 controllers. Did you look into delayed_status? BR, Thinh [*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/udc/core.c#n255
On Thu, Jan 26, 2023 at 07:31:34PM +0000, Thinh Nguyen wrote: > On Thu, Jan 26, 2023, Dan Scally wrote: > > Hi Thinh > > > > On 26/01/2023 00:20, Thinh Nguyen wrote: > > > On Tue, Jan 24, 2023, Dan Scally wrote: > > > > Hi Thinh > > > > > > > > > > > > I'm trying to update the DWC3 driver to allow the status phase of a > > > > transaction to be explicit; meaning the gadget driver has the choice to > > > > either Ack or Stall _after_ the data phase so that the contents of the data > > > > phase can be validated. I thought that I should be able to achieve this by > > > > preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() > > > > (relying on an "explicit_status" flag added to the usb_request to decide > > > > whether or not to do so) and then calling it manually later once the data > > > > phase was validated by the gadget driver (or indeed userspace). A very > > > > barebones version of my attempt to do that looks like this: > > > > > > > We shouldn't do this. At the protocol level, there must be better ways > > > to do handshake than relying on protocol STALL _after_ the data stage. > > > Note that not all controllers support this. > > > > > > Maybe I'm misunderstanding, but isn't this how the USB spec expects it to > > work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for > > the status stage in a control write it says "The function responds with > > either a handshake or a zero-length data packet to indicate its current > > status", and the handshake can be either STALL or NAK. If we can't do this, > > how else can we indicate to the host that the data sent during a control out > > transfer is in some way invalid? > > > > My concern is from the documentation note[*] added from this commit: > 579c2b46f74 ("USB Gadget: documentation update") When the gadget subsystem was originally designed, it made no allowance for sending a STALL in the status stage. The UDC drivers existing at that time would automatically send their own zero-length status packet when the control data was received. Drivers written since then have copied that approach. They had to, if they wanted to work with the existing gadget drivers. So the end result is that fully supporting status stalls will require changing pretty much every UDC driver. As for whether the UDC hardware has support... I don't know. Some of the earlier devices might not, but I expect that the more popular recent designs would provide a way to do it. Alan Stern
On Thu, Jan 26, 2023, Alan Stern wrote: > On Thu, Jan 26, 2023 at 07:31:34PM +0000, Thinh Nguyen wrote: > > On Thu, Jan 26, 2023, Dan Scally wrote: > > > Hi Thinh > > > > > > On 26/01/2023 00:20, Thinh Nguyen wrote: > > > > On Tue, Jan 24, 2023, Dan Scally wrote: > > > > > Hi Thinh > > > > > > > > > > > > > > > I'm trying to update the DWC3 driver to allow the status phase of a > > > > > transaction to be explicit; meaning the gadget driver has the choice to > > > > > either Ack or Stall _after_ the data phase so that the contents of the data > > > > > phase can be validated. I thought that I should be able to achieve this by > > > > > preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() > > > > > (relying on an "explicit_status" flag added to the usb_request to decide > > > > > whether or not to do so) and then calling it manually later once the data > > > > > phase was validated by the gadget driver (or indeed userspace). A very > > > > > barebones version of my attempt to do that looks like this: > > > > > > > > > We shouldn't do this. At the protocol level, there must be better ways > > > > to do handshake than relying on protocol STALL _after_ the data stage. > > > > Note that not all controllers support this. > > > > > > > > > Maybe I'm misunderstanding, but isn't this how the USB spec expects it to > > > work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for > > > the status stage in a control write it says "The function responds with > > > either a handshake or a zero-length data packet to indicate its current > > > status", and the handshake can be either STALL or NAK. If we can't do this, > > > how else can we indicate to the host that the data sent during a control out > > > transfer is in some way invalid? > > > > > > > My concern is from the documentation note[*] added from this commit: > > 579c2b46f74 ("USB Gadget: documentation update") > > When the gadget subsystem was originally designed, it made no allowance > for sending a STALL in the status stage. The UDC drivers existing at > that time would automatically send their own zero-length status packet > when the control data was received. > > Drivers written since then have copied that approach. They had to, if > they wanted to work with the existing gadget drivers. So the end result > is that fully supporting status stalls will require changing pretty much > every UDC driver. > > As for whether the UDC hardware has support... I don't know. Some of > the earlier devices might not, but I expect that the more popular recent > designs would provide a way to do it. > Right, it's just a bit concerning when the document also noted this: "Note that some USB device controllers disallow protocol stall responses in some cases." It could be just for older controllers as you mentioned. Hi Dan, We should already have this mechanism in place to do protocol STALL. Please look into delayed_status and set halt. Regarding this question: How else can we indicate to the host that the data sent during a control out transfer is in some way invalid? Typically there should be another request checking for the command status. I suppose if we use protocol STALL, you only need to send status request check on error cases. Thanks, Thinh
(+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) On 26/01/2023 23:57, Thinh Nguyen wrote: > On Thu, Jan 26, 2023, Alan Stern wrote: >> On Thu, Jan 26, 2023 at 07:31:34PM +0000, Thinh Nguyen wrote: >>> On Thu, Jan 26, 2023, Dan Scally wrote: >>>> Hi Thinh >>>> >>>> On 26/01/2023 00:20, Thinh Nguyen wrote: >>>>> On Tue, Jan 24, 2023, Dan Scally wrote: >>>>>> Hi Thinh >>>>>> >>>>>> >>>>>> I'm trying to update the DWC3 driver to allow the status phase of a >>>>>> transaction to be explicit; meaning the gadget driver has the choice to >>>>>> either Ack or Stall _after_ the data phase so that the contents of the data >>>>>> phase can be validated. I thought that I should be able to achieve this by >>>>>> preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() >>>>>> (relying on an "explicit_status" flag added to the usb_request to decide >>>>>> whether or not to do so) and then calling it manually later once the data >>>>>> phase was validated by the gadget driver (or indeed userspace). A very >>>>>> barebones version of my attempt to do that looks like this: >>>>>> >>>>> We shouldn't do this. At the protocol level, there must be better ways >>>>> to do handshake than relying on protocol STALL _after_ the data stage. >>>>> Note that not all controllers support this. >>>> >>>> Maybe I'm misunderstanding, but isn't this how the USB spec expects it to >>>> work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for >>>> the status stage in a control write it says "The function responds with >>>> either a handshake or a zero-length data packet to indicate its current >>>> status", and the handshake can be either STALL or NAK. If we can't do this, >>>> how else can we indicate to the host that the data sent during a control out >>>> transfer is in some way invalid? >>>> >>> My concern is from the documentation note[*] added from this commit: >>> 579c2b46f74 ("USB Gadget: documentation update") >> When the gadget subsystem was originally designed, it made no allowance >> for sending a STALL in the status stage. The UDC drivers existing at >> that time would automatically send their own zero-length status packet >> when the control data was received. >> >> Drivers written since then have copied that approach. They had to, if >> they wanted to work with the existing gadget drivers. So the end result >> is that fully supporting status stalls will require changing pretty much >> every UDC driver. >> >> As for whether the UDC hardware has support... I don't know. Some of >> the earlier devices might not, but I expect that the more popular recent >> designs would provide a way to do it. >> > Right, it's just a bit concerning when the document also noted this: > "Note that some USB device controllers disallow protocol stall responses > in some cases." > > It could be just for older controllers as you mentioned. > > > Hi Dan, > > We should already have this mechanism in place to do protocol STALL. > Please look into delayed_status and set halt. Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the function's .setup() callback and later (after userspace checks the data packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem to be working. This surprises me, as my understanding was that the purpose of USB_GADGET_DELAYED_STATUS is to pause all control transfers including the data phase to give the function driver enough time to queue a request (and possibly only for specific requests). Regardless though I think the conclusion from previous discussions on this topic (see [1] for example) was that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is why I had avoided it in the first place. A colleague made a series [2] some time ago that adds a flag to usb_request which function drivers can set when queuing the data phase request. UDC drivers then read that flag to decide whether to delay the status phase until after another usb_ep_queue(), and that's what I'm trying to implement here. [1] https://lkml.org/lkml/2018/10/10/138 [2] https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/ > > Regarding this question: > How else can we indicate to the host that the data sent during a > control out transfer is in some way invalid? > > Typically there should be another request checking for the command > status. I suppose if we use protocol STALL, you only need to send status > request check on error cases. > > Thanks, > Thinh
Hi, On 02/02/2023 12:12, Dan Scally wrote: > (+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) > > On 26/01/2023 23:57, Thinh Nguyen wrote: >> On Thu, Jan 26, 2023, Alan Stern wrote: >>> On Thu, Jan 26, 2023 at 07:31:34PM +0000, Thinh Nguyen wrote: >>>> On Thu, Jan 26, 2023, Dan Scally wrote: >>>>> Hi Thinh >>>>> >>>>> On 26/01/2023 00:20, Thinh Nguyen wrote: >>>>>> On Tue, Jan 24, 2023, Dan Scally wrote: >>>>>>> Hi Thinh >>>>>>> >>>>>>> >>>>>>> I'm trying to update the DWC3 driver to allow the status phase of a >>>>>>> transaction to be explicit; meaning the gadget driver has the choice to >>>>>>> either Ack or Stall _after_ the data phase so that the contents of the data >>>>>>> phase can be validated. I thought that I should be able to achieve this by >>>>>>> preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() >>>>>>> (relying on an "explicit_status" flag added to the usb_request to decide >>>>>>> whether or not to do so) and then calling it manually later once the data >>>>>>> phase was validated by the gadget driver (or indeed userspace). A very >>>>>>> barebones version of my attempt to do that looks like this: >>>>>>> >>>>>> We shouldn't do this. At the protocol level, there must be better ways >>>>>> to do handshake than relying on protocol STALL _after_ the data stage. >>>>>> Note that not all controllers support this. >>>>> >>>>> Maybe I'm misunderstanding, but isn't this how the USB spec expects it to >>>>> work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for >>>>> the status stage in a control write it says "The function responds with >>>>> either a handshake or a zero-length data packet to indicate its current >>>>> status", and the handshake can be either STALL or NAK. If we can't do this, >>>>> how else can we indicate to the host that the data sent during a control out >>>>> transfer is in some way invalid? >>>>> >>>> My concern is from the documentation note[*] added from this commit: >>>> 579c2b46f74 ("USB Gadget: documentation update") >>> When the gadget subsystem was originally designed, it made no allowance >>> for sending a STALL in the status stage. The UDC drivers existing at >>> that time would automatically send their own zero-length status packet >>> when the control data was received. >>> >>> Drivers written since then have copied that approach. They had to, if >>> they wanted to work with the existing gadget drivers. So the end result >>> is that fully supporting status stalls will require changing pretty much >>> every UDC driver. >>> >>> As for whether the UDC hardware has support... I don't know. Some of >>> the earlier devices might not, but I expect that the more popular recent >>> designs would provide a way to do it. >>> >> Right, it's just a bit concerning when the document also noted this: >> "Note that some USB device controllers disallow protocol stall responses >> in some cases." >> >> It could be just for older controllers as you mentioned. >> >> >> Hi Dan, >> >> We should already have this mechanism in place to do protocol STALL. >> Please look into delayed_status and set halt. > > > Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the function's .setup() callback and later (after userspace checks the data packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem to be working. This surprises me, as my understanding was that the purpose of USB_GADGET_DELAYED_STATUS is to pause all control transfers including the data phase to give the function driver enough time to queue a request (and possibly only for specific requests). Regardless though I think the conclusion from previous discussions on this topic (see [1] for example) was that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is why I had avoided it in the first place. A colleague made a series [2] some time ago that adds a flag to usb_request which function drivers can set when queuing the data phase request. UDC drivers then read that flag to decide whether to delay the status phase until after another usb_ep_queue(), and that's what I'm trying > to implement here. To give you some background on USB_GADGET_DELAYED_STATUS. As per Mass storage bulk-only spec [3] Section 3.1, "The device shall NAK the status stage of the device request until the Bulk-Only Mass Storage Reset is complete." So USB_GADGET_DELAYED_STATUS was introduced. Note: wLength field set to 0 in the mass storage control request. USB_GADGET_DELAYED_STATUS feature was limited only for this specific case. As there is no data phase in the control request, the host is simply waiting for an ACK packet when Reset operation is complete. Without USB_GADGET_DELAYED_STATUS the mass storage function would fail the USBCV mass storage compliance test at that time. [3] https://www.usb.org/sites/default/files/usbmassbulk_10.pdf > > > [1] https://lkml.org/lkml/2018/10/10/138 > > [2] https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/ > >> >> Regarding this question: >> How else can we indicate to the host that the data sent during a >> control out transfer is in some way invalid? >> >> Typically there should be another request checking for the command >> status. I suppose if we use protocol STALL, you only need to send status >> request check on error cases. >> >> Thanks, >> Thinh cheers, -roger
On Thu, Feb 02, 2023 at 10:12:45AM +0000, Dan Scally wrote: > (+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) > > On 26/01/2023 23:57, Thinh Nguyen wrote: > > We should already have this mechanism in place to do protocol STALL. > > Please look into delayed_status and set halt. > > > Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the > function's .setup() callback and later (after userspace checks the data > packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem > to be working. This surprises me, as my understanding was that the purpose > of USB_GADGET_DELAYED_STATUS is to pause all control transfers including > the data phase to give the function driver enough time to queue a request > (and possibly only for specific requests). Regardless though I think the > conclusion from previous discussions on this topic (see [1] for example) was > that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is > why I had avoided it in the first place. A colleague made a series [2] some > time ago that adds a flag to usb_request which function drivers can set when > queuing the data phase request. UDC drivers then read that flag to decide > whether to delay the status phase until after another usb_ep_queue(), and > that's what I'm trying to implement here. > > > [1] https://lkml.org/lkml/2018/10/10/138 > > [2] https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/ I'm in favor of the explicit_status approach from [2]. In fact, there was a whole series of patches impementing this, and I don't think any of them were merged. Keep in mind that there are two separate issues here: Status/data stage for a control-IN or 0-length control-OUT transfer. Status stage for a non-0-length control-OUT transfer. The USB_GADGET_DELAYED_STATUS mechanism was meant to help with the first, not the second. explicit_status was meant to help with the second; it may be able to help with both. Alan Stern
On 02/02/2023 14:52, Alan Stern wrote: > On Thu, Feb 02, 2023 at 10:12:45AM +0000, Dan Scally wrote: >> (+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) >> >> On 26/01/2023 23:57, Thinh Nguyen wrote: >>> We should already have this mechanism in place to do protocol STALL. >>> Please look into delayed_status and set halt. >> >> Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the >> function's .setup() callback and later (after userspace checks the data >> packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem >> to be working. This surprises me, as my understanding was that the purpose >> of USB_GADGET_DELAYED_STATUS is to pause all control transfers including >> the data phase to give the function driver enough time to queue a request >> (and possibly only for specific requests). Regardless though I think the >> conclusion from previous discussions on this topic (see [1] for example) was >> that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is >> why I had avoided it in the first place. A colleague made a series [2] some >> time ago that adds a flag to usb_request which function drivers can set when >> queuing the data phase request. UDC drivers then read that flag to decide >> whether to delay the status phase until after another usb_ep_queue(), and >> that's what I'm trying to implement here. >> >> >> [1] https://lkml.org/lkml/2018/10/10/138 >> >> [2] https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/ > I'm in favor of the explicit_status approach from [2]. In fact, there > was a whole series of patches impementing this, and I don't think any of > them were merged. Yep, I'm picking that series up and want to get it merged. > Keep in mind that there are two separate issues here: > > Status/data stage for a control-IN or 0-length control-OUT > transfer. > > Status stage for a non-0-length control-OUT transfer. > > The USB_GADGET_DELAYED_STATUS mechanism was meant to help with the > first, not the second. explicit_status was meant to help with the > second; it may be able to help with both. Ack - thanks. That thread I linked was very informative, I wish I'd found it sooner! > Alan Stern
On Thu, Feb 02, 2023 at 03:45:24PM +0000, Dan Scally wrote: > > On 02/02/2023 14:52, Alan Stern wrote: > > On Thu, Feb 02, 2023 at 10:12:45AM +0000, Dan Scally wrote: > > > (+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) > > > > > > On 26/01/2023 23:57, Thinh Nguyen wrote: > > > > We should already have this mechanism in place to do protocol STALL. > > > > Please look into delayed_status and set halt. > > > > > > Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the > > > function's .setup() callback and later (after userspace checks the data > > > packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem > > > to be working. This surprises me, as my understanding was that the purpose > > > of USB_GADGET_DELAYED_STATUS is to pause all control transfers including > > > the data phase to give the function driver enough time to queue a request > > > (and possibly only for specific requests). Regardless though I think the > > > conclusion from previous discussions on this topic (see [1] for example) was > > > that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is > > > why I had avoided it in the first place. A colleague made a series [2] some > > > time ago that adds a flag to usb_request which function drivers can set when > > > queuing the data phase request. UDC drivers then read that flag to decide > > > whether to delay the status phase until after another usb_ep_queue(), and > > > that's what I'm trying to implement here. > > > > > > > > > [1] https://lkml.org/lkml/2018/10/10/138 > > > > > > [2] https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/ > > I'm in favor of the explicit_status approach from [2]. In fact, there > > was a whole series of patches impementing this, and I don't think any of > > them were merged. > > > Yep, I'm picking that series up and want to get it merged. > > > Keep in mind that there are two separate issues here: > > > > Status/data stage for a control-IN or 0-length control-OUT > > transfer. > > > > Status stage for a non-0-length control-OUT transfer. > > > > The USB_GADGET_DELAYED_STATUS mechanism was meant to help with the > > first, not the second. explicit_status was meant to help with the > > second; it may be able to help with both. > > Ack - thanks. That thread I linked was very informative, I wish I'd found it > sooner! There is still a race in the gadget layer's handling of control requests. The host can send a SETUP packet at any time. So when a function driver queues a usb_request for ep0, how does the UDC driver know whether it is in response to the SETUP packet that just now arrived or in response to one that arrived earlier (and is now superseded)? This race exists even at the hardware level, and I'm pretty sure that a lot of UDC controllers don't handle it properly. But there's nothing we can do about that... My thought (and this goes back almost 20 years!) was that a UDC driver should associate a different tag value with each incoming SETUP packet. This tag would get passed to the function driver in its ->setup() callback, and the function driver would copy the value into a new .control_tag field of the usb_request structure it queues as part of the control transfer. Then the UDC driver could inspect the control_tag value when it is asked to queue a request for ep0, and it could return failure if the value doesn't match the UDC's current tag. This can be done while holding the UDC's spinlock, so it will be free of races. The right way to do this would be to add a new argument to the ->setup() callback, for the tag value. But this would mean changing the gadget API, and it would require simultaneously updating every UDC driver and every gadget/function driver. Alternatively, there could be a .current_tag field added to the usb_gadget structure, which is also passed to ->setup(). It would be more awkward, but drivers not converted to the new mechanism would simply leave the field permanently set to 0. Provided all genuine tags are nonzero, the mechanism would be backward compatible with existing code. Of course, this is all independent of the explicit_status changes. Alan Stern
On Thu, Feb 02, 2023, Alan Stern wrote: > On Thu, Feb 02, 2023 at 03:45:24PM +0000, Dan Scally wrote: > > > > On 02/02/2023 14:52, Alan Stern wrote: > > > On Thu, Feb 02, 2023 at 10:12:45AM +0000, Dan Scally wrote: > > > > (+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) > > > > > > > > On 26/01/2023 23:57, Thinh Nguyen wrote: > > > > > We should already have this mechanism in place to do protocol STALL. > > > > > Please look into delayed_status and set halt. > > > > > > > > Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the > > > > function's .setup() callback and later (after userspace checks the data > > > > packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem > > > > to be working. This surprises me, as my understanding was that the purpose > > > > of USB_GADGET_DELAYED_STATUS is to pause all control transfers including > > > > the data phase to give the function driver enough time to queue a request > > > > (and possibly only for specific requests). Regardless though I think the > > > > conclusion from previous discussions on this topic (see [1] for example) was > > > > that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is > > > > why I had avoided it in the first place. A colleague made a series [2] some > > > > time ago that adds a flag to usb_request which function drivers can set when > > > > queuing the data phase request. UDC drivers then read that flag to decide > > > > whether to delay the status phase until after another usb_ep_queue(), and > > > > that's what I'm trying to implement here. > > > > > > > > > > > > [1] https://urldefense.com/v3/__https://lkml.org/lkml/2018/10/10/138__;!!A4F2R9G_pg!egC57Exy2Wsf5lRzlULu6D3E3fc8Svgx5TnFsmB3Em1KX7OoaEnmD6dW_l8_G4pzybrPYhDqXfZ6f7XoEZVXKqUg4k5v$ > > > > > > > > [2] https://urldefense.com/v3/__https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/__;!!A4F2R9G_pg!egC57Exy2Wsf5lRzlULu6D3E3fc8Svgx5TnFsmB3Em1KX7OoaEnmD6dW_l8_G4pzybrPYhDqXfZ6f7XoEZVXKrC-ESSk$ > > > I'm in favor of the explicit_status approach from [2]. In fact, there > > > was a whole series of patches impementing this, and I don't think any of > > > them were merged. > > > > > > Yep, I'm picking that series up and want to get it merged. > > > > > Keep in mind that there are two separate issues here: > > > > > > Status/data stage for a control-IN or 0-length control-OUT > > > transfer. > > > > > > Status stage for a non-0-length control-OUT transfer. > > > > > > The USB_GADGET_DELAYED_STATUS mechanism was meant to help with the > > > first, not the second. explicit_status was meant to help with the > > > second; it may be able to help with both. While we may not have a convenient function to do the status completion, the gadget driver can always use the same mechanism for delayed status and explicitly queue the status stage on the OUT data completion. The dwc3 driver should already be able to handle that. However, it's better to have a convenient function for that, and probably remove any warning we have in the composite layer. > > > > Ack - thanks. That thread I linked was very informative, I wish I'd found it > > sooner! > > There is still a race in the gadget layer's handling of control > requests. The host can send a SETUP packet at any time. So when a > function driver queues a usb_request for ep0, how does the UDC driver > know whether it is in response to the SETUP packet that just now arrived > or in response to one that arrived earlier (and is now superseded)? Different stages of different the control transfers shouldn't interleave. If a new SETUP packet is received before completing the previous control transfer, the previous control transfer is canceled. Control transfer doesn't have something like bulk stream-id to allow for interleaving. The gadget driver should handle the different control transfers synchronously. > > This race exists even at the hardware level, and I'm pretty sure that a > lot of UDC controllers don't handle it properly. But there's nothing we > can do about that... I can't speak for other controllers, but this is for dwc3 controllers: If the host sends a new SETUP packet without finishing with the previous control transfer data/status, the data/status TRB would be completed with "SETUP_PENDING" status. This indicates that the host canceled the previous control transfer and the UDC driver needs to setup a SETUP TRB to service the new SETUP packet. The previous control transfer would be returned with a canceled status. BR, Thinh > > My thought (and this goes back almost 20 years!) was that a UDC driver > should associate a different tag value with each incoming SETUP packet. > This tag would get passed to the function driver in its ->setup() > callback, and the function driver would copy the value into a new > .control_tag field of the usb_request structure it queues as part of the > control transfer. > > Then the UDC driver could inspect the control_tag value when it is asked > to queue a request for ep0, and it could return failure if the value > doesn't match the UDC's current tag. This can be done while holding the > UDC's spinlock, so it will be free of races. > > The right way to do this would be to add a new argument to the ->setup() > callback, for the tag value. But this would mean changing the gadget > API, and it would require simultaneously updating every UDC driver and > every gadget/function driver. > > Alternatively, there could be a .current_tag field added to the > usb_gadget structure, which is also passed to ->setup(). It would be > more awkward, but drivers not converted to the new mechanism would > simply leave the field permanently set to 0. Provided all genuine tags > are nonzero, the mechanism would be backward compatible with existing > code. > > Of course, this is all independent of the explicit_status changes. > > Alan Stern
On Thu, Feb 02, 2023, Dan Scally wrote: > (+CC roger as the author of the USB_GADGET_DELAYED_STATUS mechanism) > > On 26/01/2023 23:57, Thinh Nguyen wrote: > > On Thu, Jan 26, 2023, Alan Stern wrote: > > > On Thu, Jan 26, 2023 at 07:31:34PM +0000, Thinh Nguyen wrote: > > > > On Thu, Jan 26, 2023, Dan Scally wrote: > > > > > Hi Thinh > > > > > > > > > > On 26/01/2023 00:20, Thinh Nguyen wrote: > > > > > > On Tue, Jan 24, 2023, Dan Scally wrote: > > > > > > > Hi Thinh > > > > > > > > > > > > > > > > > > > > > I'm trying to update the DWC3 driver to allow the status phase of a > > > > > > > transaction to be explicit; meaning the gadget driver has the choice to > > > > > > > either Ack or Stall _after_ the data phase so that the contents of the data > > > > > > > phase can be validated. I thought that I should be able to achieve this by > > > > > > > preventing dwc3_ep0_xfernotready() from calling dwc3_ep0_do_control_status() > > > > > > > (relying on an "explicit_status" flag added to the usb_request to decide > > > > > > > whether or not to do so) and then calling it manually later once the data > > > > > > > phase was validated by the gadget driver (or indeed userspace). A very > > > > > > > barebones version of my attempt to do that looks like this: > > > > > > > > > > > > > We shouldn't do this. At the protocol level, there must be better ways > > > > > > to do handshake than relying on protocol STALL _after_ the data stage. > > > > > > Note that not all controllers support this. > > > > > > > > > > Maybe I'm misunderstanding, but isn't this how the USB spec expects it to > > > > > work? Reading "Reporting Status Results (8.5.3.1)" in the USB 2.0 spec for > > > > > the status stage in a control write it says "The function responds with > > > > > either a handshake or a zero-length data packet to indicate its current > > > > > status", and the handshake can be either STALL or NAK. If we can't do this, > > > > > how else can we indicate to the host that the data sent during a control out > > > > > transfer is in some way invalid? > > > > > > > > > My concern is from the documentation note[*] added from this commit: > > > > 579c2b46f74 ("USB Gadget: documentation update") > > > When the gadget subsystem was originally designed, it made no allowance > > > for sending a STALL in the status stage. The UDC drivers existing at > > > that time would automatically send their own zero-length status packet > > > when the control data was received. > > > > > > Drivers written since then have copied that approach. They had to, if > > > they wanted to work with the existing gadget drivers. So the end result > > > is that fully supporting status stalls will require changing pretty much > > > every UDC driver. > > > > > > As for whether the UDC hardware has support... I don't know. Some of > > > the earlier devices might not, but I expect that the more popular recent > > > designs would provide a way to do it. > > > > > Right, it's just a bit concerning when the document also noted this: > > "Note that some USB device controllers disallow protocol stall responses > > in some cases." > > > > It could be just for older controllers as you mentioned. > > > > > > Hi Dan, > > > > We should already have this mechanism in place to do protocol STALL. > > Please look into delayed_status and set halt. > > > Thanks; I tried this by returning USB_GADGET_DELAYED_STATUS from the > function's .setup() callback and later (after userspace checks the data > packet) either calling usb_ep_queue() or usb_ep_set_halt() and it does seem > to be working. This surprises me, as my understanding was that the purpose > of USB_GADGET_DELAYED_STATUS is to pause all control transfers including > the data phase to give the function driver enough time to queue a request > (and possibly only for specific requests). Regardless though I think the > conclusion from previous discussions on this topic (see [1] for example) was > that we don't want to rely on USB_GADGET_DELAYED_STATUS to do this which is My comment initially was more for handling from the host and how it should behave. If the UVC spec requires this, then we should implement it. Since you only handle the device side, we have no control how the host would behave. I probably shouldn't have brought it up at all as it may cause more confusion. Thanks, Thinh > why I had avoided it in the first place. A colleague made a series [2] some > time ago that adds a flag to usb_request which function drivers can set when > queuing the data phase request. UDC drivers then read that flag to decide > whether to delay the status phase until after another usb_ep_queue(), and > that's what I'm trying to implement here. > > > [1] https://urldefense.com/v3/__https://lkml.org/lkml/2018/10/10/138__;!!A4F2R9G_pg!ZSl3snbG53YKu4tSa2gHu3gsxEjYW43QyGXm1YIR3oRfBqePu1Nxk3aS-cecwoVt4bCqNU6y3ZUEbrH2BScfSck_xq7_$ > > [2] https://urldefense.com/v3/__https://patchwork.kernel.org/project/linux-usb/patch/20190124030228.19840-5-paul.elder@ideasonboard.com/__;!!A4F2R9G_pg!ZSl3snbG53YKu4tSa2gHu3gsxEjYW43QyGXm1YIR3oRfBqePu1Nxk3aS-cecwoVt4bCqNU6y3ZUEbrH2BScfSacsmPOj$ > > > > > Regarding this question: > > How else can we indicate to the host that the data sent during a > > control out transfer is in some way invalid? > > > > Typically there should be another request checking for the command > > status. I suppose if we use protocol STALL, you only need to send status > > request check on error cases. > > > > Thanks, > > Thinh
On Thu, Feb 02, 2023 at 07:48:29PM +0000, Thinh Nguyen wrote: > On Thu, Feb 02, 2023, Alan Stern wrote: > > > > Keep in mind that there are two separate issues here: > > > > > > > > Status/data stage for a control-IN or 0-length control-OUT > > > > transfer. > > > > > > > > Status stage for a non-0-length control-OUT transfer. > > > > > > > > The USB_GADGET_DELAYED_STATUS mechanism was meant to help with the > > > > first, not the second. explicit_status was meant to help with the > > > > second; it may be able to help with both. > > While we may not have a convenient function to do the status completion, > the gadget driver can always use the same mechanism for delayed status > and explicitly queue the status stage on the OUT data completion. The > dwc3 driver should already be able to handle that. However, it's better > to have a convenient function for that, and probably remove any warning > we have in the composite layer. Indeed. The difficulty is that gadget drivers have to work with all UDC drivers, not just with dwc3, and the others may not support explicit queuing of status transactions. The explicit_status mechanism was designed to make this work correctly in all cases (or, at least as correctly as it works now). > > > Ack - thanks. That thread I linked was very informative, I wish I'd found it > > > sooner! > > > > There is still a race in the gadget layer's handling of control > > requests. The host can send a SETUP packet at any time. So when a > > function driver queues a usb_request for ep0, how does the UDC driver > > know whether it is in response to the SETUP packet that just now arrived > > or in response to one that arrived earlier (and is now superseded)? > > Different stages of different the control transfers shouldn't > interleave. If a new SETUP packet is received before completing the > previous control transfer, the previous control transfer is canceled. > Control transfer doesn't have something like bulk stream-id to allow for > interleaving. > > The gadget driver should handle the different control transfers > synchronously. Unfortunately gadget drivers cannot always do that, because sometimes the work they need to do in response to a control transfer cannot be carried out in interrupt context. Since ->setup() is called as part of an interrupt handler, the gadget driver may not be able to complete the necessary operations before returning from the callback. Therefore the status stage of a control transfer may have to be handled asynchronously -- which obviously opens up possibilities for races. > > This race exists even at the hardware level, and I'm pretty sure that a > > lot of UDC controllers don't handle it properly. But there's nothing we > > can do about that... > > I can't speak for other controllers, but this is for dwc3 controllers: > > If the host sends a new SETUP packet without finishing with the previous > control transfer data/status, the data/status TRB would be completed > with "SETUP_PENDING" status. This indicates that the host canceled the > previous control transfer and the UDC driver needs to setup a SETUP TRB > to service the new SETUP packet. The previous control transfer would be > returned with a canceled status. Hans Selasky mentioned in a recent email that the only controllers he trusts to get this right are the ones that use a "transaction log" sort of approach, like xHCI does. I'm not claiming that controllers using other approaches will always be wrong, but this does explain why dwc3 is able to manage control transfers correctly. Alan Stern
On 2/2/23 21:15, Alan Stern wrote: > Hans Selasky mentioned in a recent email that the only controllers he > trusts to get this right are the ones that use a "transaction log" sort > of approach, like xHCI does. I'm not claiming that controllers using > other approaches will always be wrong, but this does explain why dwc3 is > able to manage control transfers correctly. Yes, I confirm my stand on this topic. A good USB controller lays out a linked schedule of DMA commands, like this is the expected sequence of operation, for both host- and device- side. That's my conclusion so far. I know on the USB device side you typically always have to ACK the received SETUP packet, but besides from that it has helped me a lot to follow this principle, when designing various Host/Device side drivers, mostly for FreeBSD. For the handful of device-side USB chips I've added support for in FreeBSD, ranging from ATMEL (ARM/AVR32) to STM32F, Raspberry PI and a few others, not one single of them described all error cases in their data sheets for the USB control endpoint. In the beginning everything looks fine. You have a 32-bit status register, one bit for SETUP packet received, one bit for DATA, one bit for STALL and then a corresponding 32-bit interrupt mask register. All USB chip vendors I've seen so far do it exactly the same. Just different names on the bits. As long as the CPU on the USB device side is fast enough to handle all the events one by one, it's all good and sound. But at the moment multiple bits are set in the status register, like both SETUP and DATA received at the same time, then stuff starts to get difficult. Small signal driven operating systems combined with fiddling interrupt masks, is the way to hell, in my opinion. When you are pushing signal based OS'es, at some point there is typically a growing mismatch between generated IRQ signals, and consumed ones, and the system runs out of signal memory and dies. And you may laugh and think, that's easy to fix. Just wait for the signals to drain, and then you enable the interrupts again. There is only one problem, you need to trigger the issue using a Microsoft Windows running USB host, and a .exe file, else it doesn't count. As simple as that. Many professional USB companies out there, not to mention any names, drive their USB business like that. If you cannot reproduce it on Windows, then it is not an issue. I will not reveal how many times I've driven engineers crazy in the past, using my simple "usbtest" on FreeBSD, with some "hooks" in the USB host controller drivers, to do illegal stuff, so to speak. https://github.com/freebsd/freebsd-src/tree/main/tools/tools/usbtest In more recent times I've been pulled into Thunderbolt and how XHCI controllers are isolated behind DMAR engines, to provide protection towards rough devices. Personally I don't like it. The XHCI controller should be straight on the host computer. And the PCI memory space should not just return -1 when trying to access disconnected devices. I see that PCI express is already packet based, and they should just make an encapsulation for USB straight over PCI express, without the need to move all the logic to the other side. Also the stuff about USB streams in super speed mode I've disabled in FreeBSD by default. If you want to do disk stuff, you need a proper PCI based disk controller, as simple as that. The same for network. The way USB is designed, for example the BULK transport, basically forces you to memcpy() IP-packets into a bigger buffer, which is then moved in wMaxpacketSize chunks across the USB wires, typically like NCM. In FreeBSD there is a multi-packet feature, which just take the DMA pointer of all those packets, and link up a huge TD list, and then bang. But oh-no. Short terminated packets take just as long time as fully sized packets to transfer. Remember the ACK for every DATA? When already by design a protocol will lean on long and continuous data transfers, USB is no substitute for a ConnectX-4 and newer network card. And to all Intel engineers trying to facilitate that, by adding more and more features to USB: Please stop now! 10 Gbit/s is enough for USB in my opinion. If you go above that, rather use infiniband, which is already integrated with existing storage solutions. Rather than inventing yet another storage protocol. There are so many things going on down at the hardware level to get 100 GBit/s flowing without hickup, that I don't want to see any of that in the USB world. Todays rant about the state of the USB world :-) --HPS
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h index 81c486b3941c..c35436f3d3c3 100644 --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -788,6 +788,7 @@ enum dwc3_ep0_state { EP0_SETUP_PHASE, EP0_DATA_PHASE, EP0_STATUS_PHASE, + EP0_AWAITING_EXPLICIT_STATUS, }; enum dwc3_link_state { diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c index 5d642660fd15..692a99debcd9 100644 --- a/drivers/usb/dwc3/ep0.c +++ b/drivers/usb/dwc3/ep0.c @@ -894,10 +894,14 @@ static void dwc3_ep0_complete_data(struct dwc3 *dwc, dwc->ep0_bounced = false; } - if ((epnum & 1) && ur->actual < ur->length) + if ((epnum & 1) && ur->actual < ur->length) { dwc3_ep0_stall_and_restart(dwc); - else + } else { + if (r->request.explicit_status) + dwc->ep0state = EP0_AWAITING_EXPLICIT_STATUS; + dwc3_gadget_giveback(ep0, r, 0); + } } static void dwc3_ep0_complete_status(struct dwc3 *dwc, @@ -1111,6 +1115,15 @@ void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep) dep->resource_index = 0; } +void dwc3_gadget_ep0_control_ack(struct usb_ep *ep) +{ + struct dwc3_ep *dep = to_dwc3_ep(ep); + struct dwc3 *dwc = dep->dwc; + + dwc->ep0state = EP0_STATUS_PHASE; + __dwc3_ep0_do_control_status(dwc, dep); +} + static void dwc3_ep0_xfernotready(struct dwc3 *dwc, const struct dwc3_event_depevt *event) { @@ -1140,6 +1153,14 @@ static void dwc3_ep0_xfernotready(struct dwc3 *dwc, if (dwc->ep0_next_event != DWC3_EP0_NRDY_STATUS) return; + /* + * If the request asked for an explicit status stage hold off + * on sending the status until userspace asks us to. + */ + if (dwc->ep0state == EP0_AWAITING_EXPLICIT_STATUS && + !event->endpoint_number) + return; + dwc->ep0state = EP0_STATUS_PHASE; if (dwc->delayed_status) { diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 0d89dfa6eef5..85044bbbce02 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -2228,6 +2228,7 @@ static const struct usb_ep_ops dwc3_gadget_ep0_ops = { .dequeue = dwc3_gadget_ep_dequeue, .set_halt = dwc3_gadget_ep0_set_halt, .set_wedge = dwc3_gadget_ep_set_wedge, + .control_ack = dwc3_gadget_ep0_control_ack, }; static const struct usb_ep_ops dwc3_gadget_ep_ops = { diff --git a/drivers/usb/dwc3/gadget.h b/drivers/usb/dwc3/gadget.h index 55a56cf67d73..4fc9737b54ca 100644 --- a/drivers/usb/dwc3/gadget.h +++ b/drivers/usb/dwc3/gadget.h @@ -116,6 +116,7 @@ int __dwc3_gadget_ep0_set_halt(struct usb_ep *ep, int value); int dwc3_gadget_ep0_set_halt(struct usb_ep *ep, int value); int dwc3_gadget_ep0_queue(struct usb_ep *ep, struct usb_request *request, gfp_t gfp_flags); +void dwc3_gadget_ep0_control_ack(struct usb_ep *ep); int __dwc3_gadget_ep_set_halt(struct dwc3_ep *dep, int value, int protocol); void dwc3_ep0_send_delayed_status(struct dwc3 *dwc); void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool interrupt); diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h index 3ad58b7a0824..6ee43966eafb 100644 --- a/include/linux/usb/gadget.h +++ b/include/linux/usb/gadget.h @@ -122,6 +122,8 @@ struct usb_request { int status; unsigned actual; + + bool explicit_status; }; /*-------------------------------------------------------------------------*/ @@ -152,6 +154,7 @@ struct usb_ep_ops { int (*fifo_status) (struct usb_ep *ep); void (*fifo_flush) (struct usb_ep *ep); + void (*control_ack) (struct usb_ep *ep); }; /**