Message ID | 20240711084321.44916-1-xiehongyu1@kylinos.cn (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [next] usb: usbfs: Add reset_resume for usbfs | expand |
On 11.07.24 10:43, Hongyu Xie wrote: > During hibernation, usb_resume_interface will set needs_binding to 1 if > the usb_driver has no reset_resume implimentation. The USB interface > will be rebind after usb_resume_complete. > > Normally, that's fine. But if a USB interface has a matched kernel > driver, and a userspace driver or application is using this USB > interface through usbfs during hibernation, usbfs will be > detatched with the USB interface after resume. And this USB interface > will be bind with a kernel driver instead of usbfs. > > So add reset_resume to fix this. The device has lost all settings, yet we continue like nothing changed. That strikes me as a very bad idea. If a device has undergone a reset user space has not requested, we need to return an error upon the next interaction. I am sorry, but this implementation has some fundamental issues. Regards Oliver
On Thu, Jul 11, 2024 at 10:59:56AM +0200, Oliver Neukum wrote: > > > On 11.07.24 10:43, Hongyu Xie wrote: > > During hibernation, usb_resume_interface will set needs_binding to 1 if > > the usb_driver has no reset_resume implimentation. The USB interface > > will be rebind after usb_resume_complete. > > > > Normally, that's fine. But if a USB interface has a matched kernel > > driver, and a userspace driver or application is using this USB > > interface through usbfs during hibernation, usbfs will be > > detatched with the USB interface after resume. And this USB interface > > will be bind with a kernel driver instead of usbfs. > > > > So add reset_resume to fix this. > > The device has lost all settings, yet we continue like nothing > changed. That strikes me as a very bad idea. If a device has undergone > a reset user space has not requested, we need to return an error upon > the next interaction. > > I am sorry, but this implementation has some fundamental issues. Agreed, but the solution is pretty simple. Because the device was suspended, the userspace driver must have enabled suspend via the USBDEVFS_ALLOW_SUSPEND ioctl. At that point, although the driver _could_ try to do some I/O to the device, the safest approach is for the driver to call the USBDEVFS_WAIT_FOR_RESUME or USBDEVFS_FORBID_SUSPEND ioctl. So those two places are where we can return a specific error code to indicate that the device was reset while resuming. Alan Stern
From: Hongyu Xie <xiehongyu1@kylinos.cn> On 2024/7/11 16:59, Oliver Neukum wrote: > > > On 11.07.24 10:43, Hongyu Xie wrote: >> During hibernation, usb_resume_interface will set needs_binding to 1 if >> the usb_driver has no reset_resume implimentation. The USB interface >> will be rebind after usb_resume_complete. >> >> Normally, that's fine. But if a USB interface has a matched kernel >> driver, and a userspace driver or application is using this USB >> interface through usbfs during hibernation, usbfs will be >> detatched with the USB interface after resume. And this USB interface >> will be bind with a kernel driver instead of usbfs. >> >> So add reset_resume to fix this. > > The device has lost all settings, yet we continue like nothing > changed. That strikes me as a very bad idea. If a device has undergone > a reset user space has not requested, we need to return an error upon > the next interaction. Sorry I don't understand your concern. When will "a reset user space has not requested" happen if there is a reset_resume in usbfs? > > I am sorry, but this implementation has some fundamental issues. > > Regards > Oliver > Regards Hongyu Xie
On Fri, Jul 12, 2024 at 11:10:57AM +0800, Hongyu Xie wrote: > From: Hongyu Xie <xiehongyu1@kylinos.cn> > > > > On 2024/7/11 16:59, Oliver Neukum wrote: > > > > > > On 11.07.24 10:43, Hongyu Xie wrote: > > > During hibernation, usb_resume_interface will set needs_binding to 1 if > > > the usb_driver has no reset_resume implimentation. The USB interface > > > will be rebind after usb_resume_complete. > > > > > > Normally, that's fine. But if a USB interface has a matched kernel > > > driver, and a userspace driver or application is using this USB > > > interface through usbfs during hibernation, usbfs will be > > > detatched with the USB interface after resume. And this USB interface > > > will be bind with a kernel driver instead of usbfs. > > > > > > So add reset_resume to fix this. > > > > The device has lost all settings, yet we continue like nothing > > changed. That strikes me as a very bad idea. If a device has undergone > > a reset user space has not requested, we need to return an error upon > > the next interaction. > Sorry I don't understand your concern. > When will "a reset user space has not requested" happen if there is a > reset_resume in usbfs? That's what a reset-resume is: a reset that occurs when the device is resumed, rather than when the driver has requested a reset. Alan Stern > > I am sorry, but this implementation has some fundamental issues. > > > > Regards > > Oliver > > > Regards > Hongyu Xie
From: Hongyu Xie <xiehongyu1@kylinos.cn> On 2024/7/13 10:22, Alan Stern wrote: > On Fri, Jul 12, 2024 at 11:10:57AM +0800, Hongyu Xie wrote: >> From: Hongyu Xie <xiehongyu1@kylinos.cn> >> >> >> >> On 2024/7/11 16:59, Oliver Neukum wrote: >>> >>> >>> On 11.07.24 10:43, Hongyu Xie wrote: >>>> During hibernation, usb_resume_interface will set needs_binding to 1 if >>>> the usb_driver has no reset_resume implimentation. The USB interface >>>> will be rebind after usb_resume_complete. >>>> >>>> Normally, that's fine. But if a USB interface has a matched kernel >>>> driver, and a userspace driver or application is using this USB >>>> interface through usbfs during hibernation, usbfs will be >>>> detatched with the USB interface after resume. And this USB interface >>>> will be bind with a kernel driver instead of usbfs. >>>> >>>> So add reset_resume to fix this. >>> >>> The device has lost all settings, yet we continue like nothing >>> changed. That strikes me as a very bad idea. If a device has undergone >>> a reset user space has not requested, we need to return an error upon >>> the next interaction. >> Sorry I don't understand your concern. >> When will "a reset user space has not requested" happen if there is a >> reset_resume in usbfs? > > That's what a reset-resume is: a reset that occurs when the device is > resumed, rather than when the driver has requested a reset. Right now this reset_resume did nothing, it's just an empty function to prevent rebind after resume. Maybe I should filter out usbfs in usb_resume_interface when setting needs_binding to 1? > > Alan Stern > >>> I am sorry, but this implementation has some fundamental issues. >>> >>> Regards >>> Oliver >>> >> Regards >> Hongyu Xie
On 15.07.24 03:13, Hongyu Xie wrote: > From: Hongyu Xie <xiehongyu1@kylinos.cn> >>> When will "a reset user space has not requested" happen if there is a >>> reset_resume in usbfs? The literal answer to your question is: during resumption of the device or when power was restored to it. Before reset_resume() is called. >> That's what a reset-resume is: a reset that occurs when the device is >> resumed, rather than when the driver has requested a reset. > Right now this reset_resume did nothing, it's just an empty function to prevent rebind after resume. > Maybe I should filter out usbfs in usb_resume_interface when setting needs_binding to 1? Strictly speaking this is true. But it overlooks that at that point something has already been done to the device. Either it has literally been reset by instructing the hub it is connected to to reset a port or by interrupting the power supply. The important point in terms of semantics is that the device state has been lost. And it has been potentially lost within user space knowing about it. There is an inevitable race between a suspend/resume cycle and the next usbfs operation. Therefore After a suspend()/reset_resume() cycle the next usbfs operation must not be executed and an error returned to user space. At present that error is ENODEV. Regards Oliver
On 11.07.24 16:41, Alan Stern wrote: > On Thu, Jul 11, 2024 at 10:59:56AM +0200, Oliver Neukum wrote: >> I am sorry, but this implementation has some fundamental issues. > > Agreed, but the solution is pretty simple. Because the device was > suspended, the userspace driver must have enabled suspend via the > USBDEVFS_ALLOW_SUSPEND ioctl. The whole system could have been suspended, in particularly to S4. Regards Oliver
On Mon, Jul 15, 2024 at 10:53:14AM +0200, Oliver Neukum wrote: > > > On 11.07.24 16:41, Alan Stern wrote: > > On Thu, Jul 11, 2024 at 10:59:56AM +0200, Oliver Neukum wrote: > > > > I am sorry, but this implementation has some fundamental issues. > > > > Agreed, but the solution is pretty simple. Because the device was > > suspended, the userspace driver must have enabled suspend via the > > USBDEVFS_ALLOW_SUSPEND ioctl. > > The whole system could have been suspended, in particularly to S4. You are right. I was thinking of runtime suspend, not system suspend. My mistake. Alan Stern
On 15.07.24 16:22, Alan Stern wrote: > On Mon, Jul 15, 2024 at 10:53:14AM +0200, Oliver Neukum wrote: >> >> >> On 11.07.24 16:41, Alan Stern wrote: >>> Agreed, but the solution is pretty simple. Because the device was >>> suspended, the userspace driver must have enabled suspend via the >>> USBDEVFS_ALLOW_SUSPEND ioctl. >> >> The whole system could have been suspended, in particularly to S4. > > You are right. I was thinking of runtime suspend, not system suspend. > My mistake. This is at the intersection of several scenarios. That is a good part of what makes this difficult. In principal I think we could get away with checking for a flag to be set at reset_resume() before each operation. Elegant this is not. Yet, it seems to me like the race necessarily exists and is unsolvable in user space. Furthermore in the long run, if we want to use D3cold in runtime power management, it looks to me like we would want a second permission ioctl for that. Regards Oliver
From: Hongyu Xie <xiehongyu1@kylinos.cn> On 2024/7/16 20:44, Oliver Neukum wrote: > On 15.07.24 16:22, Alan Stern wrote: >> On Mon, Jul 15, 2024 at 10:53:14AM +0200, Oliver Neukum wrote: >>> >>> >>> On 11.07.24 16:41, Alan Stern wrote: > >>>> Agreed, but the solution is pretty simple. Because the device was >>>> suspended, the userspace driver must have enabled suspend via the >>>> USBDEVFS_ALLOW_SUSPEND ioctl. >>> >>> The whole system could have been suspended, in particularly to S4. >> >> You are right. I was thinking of runtime suspend, not system suspend. >> My mistake. > > This is at the intersection of several scenarios. That is a good part of > what makes this difficult. > In principal I think we could get away with checking for a flag to be set > at reset_resume() before each operation. Elegant this is not. Yet, it seems > to me like the race necessarily exists and is unsolvable in user space. > From what I know, that CONFIG_USB_DEFAULT_PERSIST is enabled by default. Then udev->persist_enabled is set to 1 and this causing udev->reset_resume set to 1 during init2 in hub_activate. During resume, usb_resume_both usb_resume_device generic_resume usb_port_resume finish_port_resume usb_reset_and_verify_device (if udev->reset_resume is 1) hub_port_init hub_port_reset usb_resume_interface (udev->reset_resume is 1 but usbfs doesn't have reset_resume implementation so set intf->needs_binding to 1, and it will be rebind in usb_resume_complete) Even before usbfs->reset_resume is called (if there is one), the USB device has already been reset and in a good state. After all that thaw_processes is called and user-space application runs again. So I still don't understand why "the race necessarily exists". Can you show me an example that usbfs->reset_resume causes race? > Furthermore in the long run, if we want to use D3cold in runtime power > management, it looks to me like we would want a second permission ioctl > for that. > > Regards > Oliver > Regards Hongyu Xie
I'm ignoring most of what you asked Oliver to focus on just one thing: On Wed, Jul 17, 2024 at 09:43:38AM +0800, Hongyu Xie wrote: > Even before usbfs->reset_resume is called (if there is one), the USB device > has already been reset and in a good state. You are wrong to think that being reset means the device is in a good state. The userspace driver may have very carefully put the device into some non-default state with special settings. All those settings will be lost when the device gets reset, and they will have to be reloaded before the device can function properly. But the userspace driver won't even know this has happened unless the kernel tells it somehow. Oliver is pointing out that the kernel has to tell the userspace driver that all the settings have been lost, so the driver will know it needs to load them back into the device. Currently we have no way to send this information to the driver. That's why usbfs doesn't have a reset_resume callback now. Alan Stern
From: Hongyu Xie <xiehongyu1@kylinos.cn> On 2024/7/17 10:05, Alan Stern wrote: > I'm ignoring most of what you asked Oliver to focus on just one thing: > > On Wed, Jul 17, 2024 at 09:43:38AM +0800, Hongyu Xie wrote: >> Even before usbfs->reset_resume is called (if there is one), the USB device >> has already been reset and in a good state. > > You are wrong to think that being reset means the device is in a good > state. > > The userspace driver may have very carefully put the device into some > non-default state with special settings. All those settings will be > lost when the device gets reset, and they will have to be reloaded > before the device can function properly. But the userspace driver won't > even know this has happened unless the kernel tells it somehow. > I was looking the whole thing from kernel's perspective. Thank you for pointing it out for me. > Oliver is pointing out that the kernel has to tell the userspace driver > that all the settings have been lost, so the driver will know it needs > to load them back into the device. Currently we have no way to send > this information to the driver. That's why usbfs doesn't have a > reset_resume callback now. But I still think that there's no need to rebind for a USB device that was using usbfs. Because rebinding doesn't fix settings lost. And it looks strange from user-space's perspective. What do you think? > > Alan Stern
On 17.07.24 03:43, Hongyu Xie wrote: > From: Hongyu Xie <xiehongyu1@kylinos.cn> Hi, sorry for being incomprehensible. I'll try to do better. > From what I know, that CONFIG_USB_DEFAULT_PERSIST is enabled by default. Then udev->persist_enabled is set to 1 and this causing udev->reset_resume set to 1 during init2 in hub_activate. > During resume, > usb_resume_both > usb_resume_device > generic_resume > usb_port_resume > finish_port_resume > usb_reset_and_verify_device (if udev->reset_resume is 1) > hub_port_init > hub_port_reset > usb_resume_interface (udev->reset_resume is 1 but usbfs doesn't have reset_resume implementation so set intf->needs_binding to 1, and it will be rebind in usb_resume_complete) That is correct. But even if CONFIG_USB_DEFAULT_PERSIST were not set, losing power would just lead to reenumeration by another code path. Devices reset themselves when they are power cycled. There is no way around that. > Even before usbfs->reset_resume is called (if there is one), the USB device has already been reset Yes, it has been reset. > and in a good state. No, it is not. Or rather, it is in the wrong state. This is not a question of good and bad. It is a question of being in the same state. > After all that thaw_processes is called and user-space application runs again. Yes. And user space does not know what has happened. > > So I still don't understand why "the race necessarily exists". Can you show me an example that usbfs->reset_resume causes race? Sure. Let's look at the example of a scanner attached to the host. OS Scanning process (in user space) 1. Set a resolution 2. Going to S4 3. Returning to S0 4. Initiate a scan As you can see the system would now scan at the wrong resolution. Step#4 has to fail. As there is no synchronization between S4 and user space, initiating the scan can always be the last step. For this to work, reset_resume() would have to restore the correct resolution. The kernel cannot do so. Regards Oliver
On 17.07.24 05:13, Hongyu Xie wrote: > From: Hongyu Xie <xiehongyu1@kylinos.cn> > But I still think that there's no need to rebind for a USB device that was using usbfs. Technically you are correct. From a conceptual view point the only hard requirement we have is that the first operation after reset_resume() has to fail with an error code user space can interpret. > Because rebinding doesn't fix settings lost. And it looks strange from user-space's perspective. > What do you think? Only user space can reapply the settings. The kernel, however, must notify user space of the need to do so and avoiding a race condition is tricky. However, it is the same race that also applies to a disconnected device and that problem is solved. ENODEV clearly is an error that makes clear to user space that settings have been lost. User space has to be able to deal with a device being disconnected at any time, as we are talking about USB. Hence, where is the need to add a special case for reset_resume()? HTH Oliver
On Wed, Jul 17, 2024 at 11:13:39AM +0800, Hongyu Xie wrote: > From: Hongyu Xie <xiehongyu1@kylinos.cn> > > > > On 2024/7/17 10:05, Alan Stern wrote: > > I'm ignoring most of what you asked Oliver to focus on just one thing: > > > > On Wed, Jul 17, 2024 at 09:43:38AM +0800, Hongyu Xie wrote: > > > Even before usbfs->reset_resume is called (if there is one), the USB device > > > has already been reset and in a good state. > > > > You are wrong to think that being reset means the device is in a good > > state. > > > > The userspace driver may have very carefully put the device into some > > non-default state with special settings. All those settings will be > > lost when the device gets reset, and they will have to be reloaded > > before the device can function properly. But the userspace driver won't > > even know this has happened unless the kernel tells it somehow. > > > I was looking the whole thing from kernel's perspective. Thank you for > pointing it out for me. > > Oliver is pointing out that the kernel has to tell the userspace driver > > that all the settings have been lost, so the driver will know it needs > > to load them back into the device. Currently we have no way to send > > this information to the driver. That's why usbfs doesn't have a > > reset_resume callback now. > But I still think that there's no need to rebind for a USB device that was > using usbfs. Because rebinding doesn't fix settings lost. And it looks > strange from user-space's perspective. > What do you think? That's right, it should be possible to avoid rebinding. But we can't do this until we have some way to tell the userspace driver that a reset has occurred. Oliver's idea is to do this by returning a special error code for the next ioctl, and I can't think of anything better. Alan Stern
On 17.07.24 15:44, Alan Stern wrote: > On Wed, Jul 17, 2024 at 11:13:39AM +0800, Hongyu Xie wrote: >> From: Hongyu Xie <xiehongyu1@kylinos.cn> >> >> >> >> On 2024/7/17 10:05, Alan Stern wrote: > > That's right, it should be possible to avoid rebinding. But we can't do > this until we have some way to tell the userspace driver that a reset > has occurred. Oliver's idea is to do this by returning a special error > code for the next ioctl, and I can't think of anything better. I wish you could, for this is supremely inelegant. In particular it forces user space to reinit the device when it is needed, instead of when it is idle. Yet, I think conceptually the first ioctl absolutely must fail, because the device state has been lost. Regards Oliver
From: Hongyu Xie <xiehongyu1@kylinos.cn> On 2024/7/17 15:42, Oliver Neukum wrote: > On 17.07.24 03:43, Hongyu Xie wrote: >> From: Hongyu Xie <xiehongyu1@kylinos.cn> > > Hi, > > sorry for being incomprehensible. I'll try to do better. > >> From what I know, that CONFIG_USB_DEFAULT_PERSIST is enabled by >> default. Then udev->persist_enabled is set to 1 and this causing >> udev->reset_resume set to 1 during init2 in hub_activate. >> During resume, >> usb_resume_both >> usb_resume_device >> generic_resume >> usb_port_resume >> finish_port_resume >> usb_reset_and_verify_device (if udev->reset_resume is 1) >> hub_port_init >> hub_port_reset >> usb_resume_interface (udev->reset_resume is 1 but usbfs doesn't >> have reset_resume implementation so set intf->needs_binding to 1, and >> it will be rebind in usb_resume_complete) > > That is correct. But even if CONFIG_USB_DEFAULT_PERSIST were not set, > losing power > would just lead to reenumeration by another code path. Devices reset > themselves > when they are power cycled. There is no way around that. > >> Even before usbfs->reset_resume is called (if there is one), the USB >> device has already been reset > > Yes, it has been reset. > >> and in a good state. > > No, it is not. Or rather, it is in the wrong state. This is not a > question of good and bad. > It is a question of being in the same state. After resume, it's in USB_STATE_CONFIGURED state. But here I'm guessing you mean not in a good state from user-space's point of view, right? > >> After all that thaw_processes is called and user-space application >> runs again. > > Yes. And user space does not know what has happened. >> >> So I still don't understand why "the race necessarily exists". Can you >> show me an example that usbfs->reset_resume causes race? > > Sure. Let's look at the example of a scanner attached to the host. > > OS Scanning process (in user space) > 1. Set a resolution > 2. Going to S4 > 3. Returning to S0 > 4. Initiate a scan > > As you can see the system would now scan at the wrong resolution. Step#4 > has to fail. As there is no synchronization between S4 and user space, > initiating > the scan can always be the last step. > For this to work, reset_resume() would have to restore the correct > resolution. The kernel > cannot do so. Now I can see your point. And I think with or without usbfs->reset_resume right now, Step#4 has to fail. > > Regards > Oliver Regards Hongyu Xie
diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c index 3beb6a862e80..8c07df104c9a 100644 --- a/drivers/usb/core/devio.c +++ b/drivers/usb/core/devio.c @@ -746,6 +746,11 @@ static int driver_resume(struct usb_interface *intf) return 0; } +static int driver_reset_resume(struct usb_interface *intf) +{ + return 0; +} + #ifdef CONFIG_PM /* The following routines apply to the entire device, not interfaces */ void usbfs_notify_suspend(struct usb_device *udev) @@ -773,6 +778,7 @@ struct usb_driver usbfs_driver = { .disconnect = driver_disconnect, .suspend = driver_suspend, .resume = driver_resume, + .reset_resume = driver_reset_resume, .supports_autosuspend = 1, };
During hibernation, usb_resume_interface will set needs_binding to 1 if the usb_driver has no reset_resume implimentation. The USB interface will be rebind after usb_resume_complete. Normally, that's fine. But if a USB interface has a matched kernel driver, and a userspace driver or application is using this USB interface through usbfs during hibernation, usbfs will be detatched with the USB interface after resume. And this USB interface will be bind with a kernel driver instead of usbfs. So add reset_resume to fix this. Signed-off-by: Hongyu Xie <xiehongyu1@kylinos.cn> --- drivers/usb/core/devio.c | 6 ++++++ 1 file changed, 6 insertions(+)