Message ID | 20210922125939.427-1-caihuoqing@baidu.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Jiri Kosina |
Headers | show |
Series | HID: wacom: Make use of the helper function devm_add_action_or_reset() | expand |
On Wed, 22 Sep 2021, Cai Huoqing wrote: > The helper function devm_add_action_or_reset() will internally > call devm_add_action(), and if devm_add_action() fails then it will > execute the action mentioned and return the error code. So > use devm_add_action_or_reset() instead of devm_add_action() > to simplify the error handling, reduce the code. > > Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> CCing Jason and Ping to Ack this for the Wacom driver. > --- > drivers/hid/wacom_sys.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c > index 93f49b766376..3aed7ba249f7 100644 > --- a/drivers/hid/wacom_sys.c > +++ b/drivers/hid/wacom_sys.c > @@ -892,10 +892,9 @@ static int wacom_add_shared_data(struct hid_device *hdev) > > wacom_wac->shared = &data->shared; > > - retval = devm_add_action(&hdev->dev, wacom_remove_shared_data, wacom); > + retval = devm_add_action_or_reset(&hdev->dev, wacom_remove_shared_data, wacom); > if (retval) { > mutex_unlock(&wacom_udev_list_lock); > - wacom_remove_shared_data(wacom); > return retval; > } > > -- > 2.25.1 >
On 14 10月 21 10:31:03, Jason Gerecke wrote: > I've attached an RFC patch which shrinks the critical section as I > previously described. This would be applied prior to Cai's patch. Hi Jason, I haved sorted the patches to a series, and fixed the repeated "that" in changelog. like this: https://patchwork.kernel.org/project/linux-input/patch/20211015022803.3827-1-caihuoqing@baidu.com/ If there are any issue to resend v3 or later, feel free to sort my patch as series. You also can attach your patch link directly here. BTW, a minor issue, for 'RFC' prefix, you can use git format-patch --rfc. It should be showed in subject prefix, like "[RFC PATCH ..]" (in the link, I missed fixing it). > > I would appreciate a few more sets of eyes reviewing / testing the change. > > Jason > --- > Now instead of four in the eights place / > you’ve got three, ‘Cause you added one / > (That is to say, eight) to the two, / > But you can’t take seven from three, / > So you look at the sixty-fours.... > > > > On Thu, Oct 7, 2021 at 3:34 PM Cheng, Ping <Ping.Cheng@wacom.com> wrote: > > > I didn’t add mutex_unlock nor work on wacom_remove_shared_data myself. > > Benjamin probably sync’d unlock and Dmitry added shared_data for Wacom > > driver many years ago. Thank you both! > > > > > > > > With that said, I am willing to look into the code and test the patch to > > make sure it doesn’t break anything, which may take a few more days… > > > > > > > > *From:* Jason Gerecke [mailto:killertofu@gmail.com] > > *Sent:* Thursday, October 7, 2021 2:48 PM > > > > > > > > I have not tested this, but it seems like the failure case could trigger a > > deadlock: > > > > > > > > 1. (wacom_sys.c:878): The `wacom_udev_list_lock` mutex is locked > > > > 2. (wacom_sys.c:888): The `data->kref` refcount is initialized to 1 > > > > 3. (wacom_sys.c:893): The `wacom_wac->shared` pointer is set > > > > 4. (wacom_sys.c:895): We call `devm_add_action_or_reset` > > > > 5. Adding the action fails, causing the `devm_add_action_or_reset` to > > immediately call `wacom_remove_shared_data` > > > > 6. (wacom_sys.c:866): The reference count of `data->kref` is decremented, > > triggering a call to `wacom_release_shared_data` > > > > 7. (wacom_sys.c:844): The `wacom_release_shared_data` function blocks on > > the previously-locked `wacom_udev_list_lock` mutex > > > > > > > > I *think* it would be safe to shrink the critical section in > > `wacom_add_shared_data` to end before the call to > > `devm_add_action_or_reset`. It might be possible to push the unlock as far > > back as line 892. That should be sufficient to protect `wacom_udev_list` > > and ensure that we don't accidentally create two "shared" objects when only > > one is desired. I'll defer to Ping since it looks like she added the mutex > > though :) > > > > > > Jason > > > > On Thu, Oct 7, 2021 at 4:39 AM Jiri Kosina <jikos@kernel.org> wrote: > > > > On Wed, 22 Sep 2021, Cai Huoqing wrote: > > > > > The helper function devm_add_action_or_reset() will internally > > > call devm_add_action(), and if devm_add_action() fails then it will > > > execute the action mentioned and return the error code. So > > > use devm_add_action_or_reset() instead of devm_add_action() > > > to simplify the error handling, reduce the code. > > > > > > Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> > > > > CCing Jason and Ping to Ack this for the Wacom driver. > > > > > --- > > > drivers/hid/wacom_sys.c | 3 +-- > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c > > > index 93f49b766376..3aed7ba249f7 100644 > > > --- a/drivers/hid/wacom_sys.c > > > +++ b/drivers/hid/wacom_sys.c > > > @@ -892,10 +892,9 @@ static int wacom_add_shared_data(struct hid_device > > *hdev) > > > > > > wacom_wac->shared = &data->shared; > > > > > > - retval = devm_add_action(&hdev->dev, wacom_remove_shared_data, > > wacom); > > > + retval = devm_add_action_or_reset(&hdev->dev, > > wacom_remove_shared_data, wacom); > > > if (retval) { > > > mutex_unlock(&wacom_udev_list_lock); > > > - wacom_remove_shared_data(wacom); > > > return retval; > > > } > > > > > > -- > > > 2.25.1 > > > > > From 7adc05783c7e3120028d0d089bea224903c24ccd Mon Sep 17 00:00:00 2001 > From: Jason Gerecke <jason.gerecke@wacom.com> > Date: Thu, 14 Oct 2021 07:31:31 -0700 > Subject: [PATCH] RFC: HID: wacom: Shrink critical section in > `wacom_add_shared_data` > > The size of the critical section in this function appears to be larger > than necessary. The `wacom_udev_list_lock` exists to ensure that one > interface cannot begin checking if a shared object exists while a second > interface is doing the same (otherwise both could determine that that no > object exists yet and create their own independent objects rather than > sharing just one). It should be safe for the critical section to end > once a fresly-allocated shared object would be found by other threads > (i.e., once it has been added to `wacom_udev_list`, which is looped > over by `wacom_get_hdev_data`). > > This commit is a necessary pre-requisite for a later change to swap the > use of `devm_add_action` with `devm_add_action_or_reset`, which would > otherwise deadlock in its error case. > > Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> > --- > drivers/hid/wacom_sys.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c > index 93f49b766376..62f50e4b837d 100644 > --- a/drivers/hid/wacom_sys.c > +++ b/drivers/hid/wacom_sys.c > @@ -881,8 +881,8 @@ static int wacom_add_shared_data(struct hid_device *hdev) > if (!data) { > data = kzalloc(sizeof(struct wacom_hdev_data), GFP_KERNEL); > if (!data) { > - retval = -ENOMEM; > - goto out; > + mutex_unlock(&wacom_udev_list_lock); > + return -ENOMEM; > } > > kref_init(&data->kref); > @@ -890,11 +890,12 @@ static int wacom_add_shared_data(struct hid_device *hdev) > list_add_tail(&data->list, &wacom_udev_list); > } > > + mutex_unlock(&wacom_udev_list_lock); > + > wacom_wac->shared = &data->shared; > > retval = devm_add_action(&hdev->dev, wacom_remove_shared_data, wacom); > if (retval) { > - mutex_unlock(&wacom_udev_list_lock); > wacom_remove_shared_data(wacom); > return retval; > } > @@ -904,8 +905,6 @@ static int wacom_add_shared_data(struct hid_device *hdev) > else if (wacom_wac->features.device_type & WACOM_DEVICETYPE_PEN) > wacom_wac->shared->pen = hdev; > > -out: > - mutex_unlock(&wacom_udev_list_lock); > return retval; > } > > -- > 2.33.0 >
I tested the set of two patches. I didn't see any issues with them applied. But, while reviewing the patches, I noticed a minor logic mismatch between the current patch and the original code. I'd hope at least one of the maintainers (Jiri, Benjamin, or Dimitry) reviews this patch, especially the part that I commented below, to make sure that we don't trigger any race condition. Thank you Huoqing, Jason, and the maintainer team! > > From 7adc05783c7e3120028d0d089bea224903c24ccd Mon Sep 17 00:00:00 2001 > > From: Jason Gerecke <jason.gerecke@wacom.com> > > Date: Thu, 14 Oct 2021 07:31:31 -0700 > > Subject: [PATCH] RFC: HID: wacom: Shrink critical section in > > `wacom_add_shared_data` > > > > The size of the critical section in this function appears to be larger > > than necessary. The `wacom_udev_list_lock` exists to ensure that one > > interface cannot begin checking if a shared object exists while a second > > interface is doing the same (otherwise both could determine that that no > > object exists yet and create their own independent objects rather than > > sharing just one). It should be safe for the critical section to end > > once a fresly-allocated shared object would be found by other threads > > (i.e., once it has been added to `wacom_udev_list`, which is looped > > over by `wacom_get_hdev_data`). > > > > This commit is a necessary pre-requisite for a later change to swap the > > use of `devm_add_action` with `devm_add_action_or_reset`, which would > > otherwise deadlock in its error case. > > > > Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> > > --- > > drivers/hid/wacom_sys.c | 9 ++++----- > > 1 file changed, 4 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c > > index 93f49b766376..62f50e4b837d 100644 > > --- a/drivers/hid/wacom_sys.c > > +++ b/drivers/hid/wacom_sys.c > > @@ -881,8 +881,8 @@ static int wacom_add_shared_data(struct hid_device *hdev) > > if (!data) { > > data = kzalloc(sizeof(struct wacom_hdev_data), GFP_KERNEL); > > if (!data) { > > - retval = -ENOMEM; > > - goto out; > > + mutex_unlock(&wacom_udev_list_lock); > > + return -ENOMEM; > > } > > > > kref_init(&data->kref); > > @@ -890,11 +890,12 @@ static int wacom_add_shared_data(struct hid_device *hdev) > > list_add_tail(&data->list, &wacom_udev_list); > > } > > > > + mutex_unlock(&wacom_udev_list_lock); > > + > > wacom_wac->shared = &data->shared; > > > > retval = devm_add_action(&hdev->dev, wacom_remove_shared_data, wacom); > > if (retval) { > > - mutex_unlock(&wacom_udev_list_lock); The mutex_unlock was called after devm_add_action is finished, whether it is a failure or success. The new code calls mutex_unlock before devm_add_action is executed. Is that ok? If there is no concern from the maintainers, the patch has been: Reviewed-by: Ping Cheng <ping.cheng@wacom.com> Tested-by: Ping Cheng <ping.cheng@wacom.com> Cheers, Ping > > wacom_remove_shared_data(wacom); > > return retval; > > } > > @@ -904,8 +905,6 @@ static int wacom_add_shared_data(struct hid_device *hdev) > > else if (wacom_wac->features.device_type & WACOM_DEVICETYPE_PEN) > > wacom_wac->shared->pen = hdev; > > > > -out: > > - mutex_unlock(&wacom_udev_list_lock); > > return retval; > > } > > > > -- > > 2.33.0 > > >
Hi Ping, On Sun, Oct 17, 2021 at 02:58:47PM -0700, Ping Cheng wrote: > I tested the set of two patches. I didn't see any issues with them > applied. But, while reviewing the patches, I noticed a minor logic > mismatch between the current patch and the original code. I'd hope at > least one of the maintainers (Jiri, Benjamin, or Dimitry) reviews this > patch, especially the part that I commented below, to make sure that > we don't trigger any race condition. > > Thank you Huoqing, Jason, and the maintainer team! > > > > From 7adc05783c7e3120028d0d089bea224903c24ccd Mon Sep 17 00:00:00 2001 > > > From: Jason Gerecke <jason.gerecke@wacom.com> > > > Date: Thu, 14 Oct 2021 07:31:31 -0700 > > > Subject: [PATCH] RFC: HID: wacom: Shrink critical section in > > > `wacom_add_shared_data` > > > > > > The size of the critical section in this function appears to be larger > > > than necessary. The `wacom_udev_list_lock` exists to ensure that one > > > interface cannot begin checking if a shared object exists while a second > > > interface is doing the same (otherwise both could determine that that no > > > object exists yet and create their own independent objects rather than > > > sharing just one). It should be safe for the critical section to end > > > once a fresly-allocated shared object would be found by other threads > > > (i.e., once it has been added to `wacom_udev_list`, which is looped > > > over by `wacom_get_hdev_data`). > > > > > > This commit is a necessary pre-requisite for a later change to swap the > > > use of `devm_add_action` with `devm_add_action_or_reset`, which would > > > otherwise deadlock in its error case. > > > > > > Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> > > > --- > > > drivers/hid/wacom_sys.c | 9 ++++----- > > > 1 file changed, 4 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c > > > index 93f49b766376..62f50e4b837d 100644 > > > --- a/drivers/hid/wacom_sys.c > > > +++ b/drivers/hid/wacom_sys.c > > > @@ -881,8 +881,8 @@ static int wacom_add_shared_data(struct hid_device *hdev) > > > if (!data) { > > > data = kzalloc(sizeof(struct wacom_hdev_data), GFP_KERNEL); > > > if (!data) { > > > - retval = -ENOMEM; > > > - goto out; > > > + mutex_unlock(&wacom_udev_list_lock); > > > + return -ENOMEM; > > > } > > > > > > kref_init(&data->kref); > > > @@ -890,11 +890,12 @@ static int wacom_add_shared_data(struct hid_device *hdev) > > > list_add_tail(&data->list, &wacom_udev_list); > > > } > > > > > > + mutex_unlock(&wacom_udev_list_lock); > > > + > > > wacom_wac->shared = &data->shared; > > > > > > retval = devm_add_action(&hdev->dev, wacom_remove_shared_data, wacom); > > > if (retval) { > > > - mutex_unlock(&wacom_udev_list_lock); > > The mutex_unlock was called after devm_add_action is finished, whether > it is a failure or success. The new code calls mutex_unlock before > devm_add_action is executed. Is that ok? I think this is OK, but I would prefer if assignments that alter the shared data (i.e. assignment to wacom_wac->shared->pen, etc) would continue stay under mutex protection, so they need to be pulled up. With that you can add my Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> to the both patches, provided that Jason's comes first. Thanks.
On Sun, 17 Oct 2021, Ping Cheng wrote: > I tested the set of two patches. I didn't see any issues with them > applied. But, while reviewing the patches, I noticed a minor logic > mismatch between the current patch and the original code. I'd hope at > least one of the maintainers (Jiri, Benjamin, or Dimitry) reviews this > patch, especially the part that I commented below, to make sure that > we don't trigger any race condition. I don't see any issue with that ordering, but I'd also prefer for clarity to keep updating the shared data structure under the mutex protection. With that, please send me the series with both patches and the Acks / Review-by accumulated, and I'll apply it. Thanks,
On Tue, Oct 19, 2021 at 8:36 AM Jason Gerecke <killertofu@gmail.com> wrote: > > On Sun, Oct 17, 2021 at 9:53 PM Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote: >> >> I think this is OK, but I would prefer if assignments that alter the >> >> shared data (i.e. assignment to wacom_wac->shared->pen, etc) would >> continue stay under mutex protection, so they need to be pulled up. > > > > On Mon, Oct 18, 2021 at 8:26 AM Jiri Kosina <jikos@kernel.org> wrote: >> >> I don't see any issue with that ordering, but I'd also prefer for clarity >> to keep updating the shared data structure under the mutex protection. >> > > The data behind the "shared" struct (e.g. wacom_wac->shared->pen) is not currently under any mutex protection. I don't think mutex protection is necessary, but we can take a look... I believe all of its members are either flags (so already atomic) or initialized during probe and then just used as a handle with appropriate NULL checks (but maybe two threads could be simultaneously issuing events to the same device?). > > If a patch to add mutex protection to the shared struct is necessary, that's going to be a seperate patch that touches a lot more of the driver. Following up on this. I took a second look at the shared struct, and believe that things should work fine during initialization and steady-state. There are, however, opportunities for e.g. one device/thread to be removed and set e.g. `shared->touch = NULL` while a second device/thread is attempting to send an event out of that device. This is going to be very rare and only on disconnect, which is probably why we've never received reports of real-world issues. This shared issue is present with or without the changes by Cai and myself. I would ask that these two patches be merged while we look at introducing a new mutex to protect the contents of the shared pointer. Jason --- Now instead of four in the eights place / you’ve got three, ‘Cause you added one / (That is to say, eight) to the two, / But you can’t take seven from three, / So you look at the sixty-fours....
On Tue, 26 Oct 2021, Jason Gerecke wrote: > Following up on this. I took a second look at the shared struct, and > believe that things should work fine during initialization and > steady-state. There are, however, opportunities for e.g. one > device/thread to be removed and set e.g. `shared->touch = NULL` while a > second device/thread is attempting to send an event out of that device. > This is going to be very rare and only on disconnect, which is probably > why we've never received reports of real-world issues. > > This shared issue is present with or without the changes by Cai and > myself. I would ask that these two patches be merged Now applied. Thanks,
diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c index 93f49b766376..3aed7ba249f7 100644 --- a/drivers/hid/wacom_sys.c +++ b/drivers/hid/wacom_sys.c @@ -892,10 +892,9 @@ static int wacom_add_shared_data(struct hid_device *hdev) wacom_wac->shared = &data->shared; - retval = devm_add_action(&hdev->dev, wacom_remove_shared_data, wacom); + retval = devm_add_action_or_reset(&hdev->dev, wacom_remove_shared_data, wacom); if (retval) { mutex_unlock(&wacom_udev_list_lock); - wacom_remove_shared_data(wacom); return retval; }
The helper function devm_add_action_or_reset() will internally call devm_add_action(), and if devm_add_action() fails then it will execute the action mentioned and return the error code. So use devm_add_action_or_reset() instead of devm_add_action() to simplify the error handling, reduce the code. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> --- drivers/hid/wacom_sys.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)