Message ID | 1714658761-15326-1-git-send-email-quic_zijuhu@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v1] Bluetooth: qca: Fix BT enable failure again for QCA6390 after warm reboot | expand |
Context | Check | Description |
---|---|---|
tedd_an/pre-ci_am | success | Success |
tedd_an/CheckPatch | success | CheckPatch PASS |
tedd_an/GitLint | success | Gitlint PASS |
tedd_an/SubjectPrefix | success | Gitlint PASS |
tedd_an/BuildKernel | success | BuildKernel PASS |
tedd_an/CheckAllWarning | success | CheckAllWarning PASS |
tedd_an/CheckSparse | success | CheckSparse PASS |
tedd_an/CheckSmatch | fail | CheckSparse: FAIL: Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: net/bluetooth/hci_core.o] Error 139 make[4]: *** Deleting file 'net/bluetooth/hci_core.o' make[3]: *** [scripts/Makefile.build:485: net/bluetooth] Error 2 make[2]: *** [scripts/Makefile.build:485: net] Error 2 make[2]: *** Waiting for unfinished jobs.... Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bcm203x.o] Error 139 make[4]: *** Deleting file 'drivers/bluetooth/bcm203x.o' make[4]: *** Waiting for unfinished jobs.... Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bpa10x.o] Error 139 make[4]: *** Deleting file 'drivers/bluetooth/bpa10x.o' make[3]: *** [scripts/Makefile.build:485: drivers/bluetooth] Error 2 make[2]: *** [scripts/Makefile.build:485: drivers] Error 2 make[1]: *** [/github/workspace/src/src/Makefile:1919: .] Error 2 make: *** [Makefile:240: __sub-make] Error 2 |
tedd_an/BuildKernel32 | success | BuildKernel32 PASS |
tedd_an/TestRunnerSetup | success | TestRunnerSetup PASS |
tedd_an/TestRunner_l2cap-tester | success | TestRunner PASS |
tedd_an/TestRunner_iso-tester | success | TestRunner PASS |
tedd_an/TestRunner_bnep-tester | success | TestRunner PASS |
tedd_an/TestRunner_mgmt-tester | success | TestRunner PASS |
tedd_an/TestRunner_rfcomm-tester | success | TestRunner PASS |
tedd_an/TestRunner_sco-tester | success | TestRunner PASS |
tedd_an/TestRunner_ioctl-tester | success | TestRunner PASS |
tedd_an/TestRunner_mesh-tester | success | TestRunner PASS |
tedd_an/TestRunner_smp-tester | success | TestRunner PASS |
tedd_an/TestRunner_userchan-tester | success | TestRunner PASS |
tedd_an/IncrementalBuild | success | Incremental Build PASS |
This is automated email and please do not reply to this email! Dear submitter, Thank you for submitting the patches to the linux bluetooth mailing list. This is a CI test results with your patch series: PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=849912 ---Test result--- Test Summary: CheckPatch PASS 0.49 seconds GitLint PASS 0.19 seconds SubjectPrefix PASS 0.06 seconds BuildKernel PASS 30.95 seconds CheckAllWarning PASS 34.14 seconds CheckSparse PASS 43.14 seconds CheckSmatch FAIL 37.86 seconds BuildKernel32 PASS 29.85 seconds TestRunnerSetup PASS 546.06 seconds TestRunner_l2cap-tester PASS 18.60 seconds TestRunner_iso-tester PASS 31.83 seconds TestRunner_bnep-tester PASS 4.77 seconds TestRunner_mgmt-tester PASS 113.61 seconds TestRunner_rfcomm-tester PASS 7.43 seconds TestRunner_sco-tester PASS 15.15 seconds TestRunner_ioctl-tester PASS 7.94 seconds TestRunner_mesh-tester PASS 5.89 seconds TestRunner_smp-tester PASS 6.74 seconds TestRunner_userchan-tester PASS 4.87 seconds IncrementalBuild PASS 28.61 seconds Details ############################## Test: CheckSmatch - FAIL Desc: Run smatch tool with source Output: Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: net/bluetooth/hci_core.o] Error 139 make[4]: *** Deleting file 'net/bluetooth/hci_core.o' make[3]: *** [scripts/Makefile.build:485: net/bluetooth] Error 2 make[2]: *** [scripts/Makefile.build:485: net] Error 2 make[2]: *** Waiting for unfinished jobs.... Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bcm203x.o] Error 139 make[4]: *** Deleting file 'drivers/bluetooth/bcm203x.o' make[4]: *** Waiting for unfinished jobs.... Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bpa10x.o] Error 139 make[4]: *** Deleting file 'drivers/bluetooth/bpa10x.o' make[3]: *** [scripts/Makefile.build:485: drivers/bluetooth] Error 2 make[2]: *** [scripts/Makefile.build:485: drivers] Error 2 make[1]: *** [/github/workspace/src/src/Makefile:1919: .] Error 2 make: *** [Makefile:240: __sub-make] Error 2 --- Regards, Linux Bluetooth
On 02/05/2024 16:06, Zijun Hu wrote: > Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed > serdev") will cause below regression issue: > > BT can't be enabled after below steps: > cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure > if property enable-gpios is not configured within DT|ACPI for QCA6390. > > The commit is to fix a use-after-free issue within qca_serdev_shutdown() > during reboot, but also introduces this regression issue regarding above > steps since the VSC is not sent to reset controller during warm reboot. > > Fixed by sending the VSC to reset controller within qca_serdev_shutdown() > once BT was ever enabled, and the use-after-free issue is also be fixed > by this change since serdev is still opened when send to serdev. > > Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") > Cc: stable@vger.kernel.org > Reported-by: Wren Turkal <wt@penguintechs.org> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 > Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> > Tested-by: Wren Turkal <wt@penguintechs.org> > --- > drivers/bluetooth/hci_qca.c | 5 ++--- I don't think this is v1. Version your patches properly and provide changelog. I asked already *two times*: 1. On which kernel did you test it? 2. On which hardware did you test it? I am not interested in any replies like "I wrote something on bugzilla". I am really fed up with your elusive, time-wasting replies, so be specific here. Best regards, Krzysztof
On 5/3/2024 6:16 PM, Krzysztof Kozlowski wrote: > On 02/05/2024 16:06, Zijun Hu wrote: >> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed >> serdev") will cause below regression issue: >> >> BT can't be enabled after below steps: >> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure >> if property enable-gpios is not configured within DT|ACPI for QCA6390. >> >> The commit is to fix a use-after-free issue within qca_serdev_shutdown() >> during reboot, but also introduces this regression issue regarding above >> steps since the VSC is not sent to reset controller during warm reboot. >> >> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() >> once BT was ever enabled, and the use-after-free issue is also be fixed >> by this change since serdev is still opened when send to serdev. >> >> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") >> Cc: stable@vger.kernel.org >> Reported-by: Wren Turkal <wt@penguintechs.org> >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >> Tested-by: Wren Turkal <wt@penguintechs.org> >> --- >> drivers/bluetooth/hci_qca.c | 5 ++--- > > I don't think this is v1. Version your patches properly and provide > changelog. > i sent it as v1 to start a new and clean discussion. > I asked already *two times*: > 1. On which kernel did you test it? > 2. On which hardware did you test it? > will provide such info within next commit message. > I am not interested in any replies like "I wrote something on bugzilla". > I am really fed up with your elusive, time-wasting replies, so be > specific here. > are there any other concerns about this patch itself? > Best regards, > Krzysztof >
Hi Zijun, On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote: > > Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed > serdev") will cause below regression issue: > > BT can't be enabled after below steps: > cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure > if property enable-gpios is not configured within DT|ACPI for QCA6390. > > The commit is to fix a use-after-free issue within qca_serdev_shutdown() > during reboot, but also introduces this regression issue regarding above > steps since the VSC is not sent to reset controller during warm reboot. > > Fixed by sending the VSC to reset controller within qca_serdev_shutdown() > once BT was ever enabled, and the use-after-free issue is also be fixed > by this change since serdev is still opened when send to serdev. > > Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") > Cc: stable@vger.kernel.org > Reported-by: Wren Turkal <wt@penguintechs.org> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 > Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> > Tested-by: Wren Turkal <wt@penguintechs.org> > --- > drivers/bluetooth/hci_qca.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c > index 0c9c9ee56592..8e35c9091486 100644 > --- a/drivers/bluetooth/hci_qca.c > +++ b/drivers/bluetooth/hci_qca.c > @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) > struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); > struct hci_uart *hu = &qcadev->serdev_hu; > struct hci_dev *hdev = hu->hdev; > - struct qca_data *qca = hu->priv; > const u8 ibs_wake_cmd[] = { 0xFD }; > const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; > > if (qcadev->btsoc_type == QCA_QCA6390) { > - if (test_bit(QCA_BT_OFF, &qca->flags) || > - !test_bit(HCI_RUNNING, &hdev->flags)) This probably deserves a comment on why you end up with HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you are removing the flags above since that was introduce to prevent use-after-free this sort of revert it so I do wonder how serdev can still be open if you haven't tested for QCA_BT_OFF for example? > + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || > + hci_dev_test_flag(hdev, HCI_SETUP)) > return; > > serdev_device_write_flush(serdev); > -- > 2.7.4 >
On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote: > Hi Zijun, > > On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote: >> >> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed >> serdev") will cause below regression issue: >> >> BT can't be enabled after below steps: >> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure >> if property enable-gpios is not configured within DT|ACPI for QCA6390. >> >> The commit is to fix a use-after-free issue within qca_serdev_shutdown() >> during reboot, but also introduces this regression issue regarding above >> steps since the VSC is not sent to reset controller during warm reboot. >> >> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() >> once BT was ever enabled, and the use-after-free issue is also be fixed >> by this change since serdev is still opened when send to serdev. >> >> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") >> Cc: stable@vger.kernel.org >> Reported-by: Wren Turkal <wt@penguintechs.org> >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >> Tested-by: Wren Turkal <wt@penguintechs.org> >> --- >> drivers/bluetooth/hci_qca.c | 5 ++--- >> 1 file changed, 2 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c >> index 0c9c9ee56592..8e35c9091486 100644 >> --- a/drivers/bluetooth/hci_qca.c >> +++ b/drivers/bluetooth/hci_qca.c >> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) >> struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); >> struct hci_uart *hu = &qcadev->serdev_hu; >> struct hci_dev *hdev = hu->hdev; >> - struct qca_data *qca = hu->priv; >> const u8 ibs_wake_cmd[] = { 0xFD }; >> const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; >> >> if (qcadev->btsoc_type == QCA_QCA6390) { >> - if (test_bit(QCA_BT_OFF, &qca->flags) || >> - !test_bit(HCI_RUNNING, &hdev->flags)) > > This probably deserves a comment on why you end up with > HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you > are removing the flags above since that was introduce to prevent > use-after-free this sort of revert it so I do wonder how serdev can > still be open if you haven't tested for QCA_BT_OFF for example? > okay, let me give comments at next version. this design logic is shown below. you maybe review it. if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup() is able to be invoked by every open() to initializate SoC without any help. so we don't need to send the VSC to reset SoC into initial and clean state for the next hdev->setup() call success. otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set. if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the SOC is already in the initial and clean state, so we also don't need to send the VSC to reset SOC. otherwise, we need to send the VSC to reset Soc into a initial and clean state for hdev->setup() call success after "warm reboot -> enable BT" for the case commit message cares about, the only factor which decide to send the VSC is that SoC is a initial and clean state or not after warm reboot, any other factors are irrelevant to this decision. why the serdev is still open after go through (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || hci_dev_test_flag(hdev, HCI_SETUP) checking is that serdev is not closed by hci_uart_close(). see hci_uart_close() within drivers/bluetooth/hci_serdev.c static int hci_uart_close(struct hci_dev *hdev) { ...... /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver, * BT SOC is completely powered OFF during BT OFF, holding port * open may drain the battery. */ if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) { clear_bit(HCI_UART_PROTO_READY, &hu->flags); serdev_device_close(hu->serdev); } return 0; } >> + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || >> + hci_dev_test_flag(hdev, HCI_SETUP)) >> return; >> >> serdev_device_write_flush(serdev); >> -- >> 2.7.4 >> > >
Hi, On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote: > > On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote: > > Hi Zijun, > > > > On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote: > >> > >> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed > >> serdev") will cause below regression issue: > >> > >> BT can't be enabled after below steps: > >> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure > >> if property enable-gpios is not configured within DT|ACPI for QCA6390. > >> > >> The commit is to fix a use-after-free issue within qca_serdev_shutdown() > >> during reboot, but also introduces this regression issue regarding above > >> steps since the VSC is not sent to reset controller during warm reboot. > >> > >> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() > >> once BT was ever enabled, and the use-after-free issue is also be fixed > >> by this change since serdev is still opened when send to serdev. > >> > >> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") > >> Cc: stable@vger.kernel.org > >> Reported-by: Wren Turkal <wt@penguintechs.org> > >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 > >> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> > >> Tested-by: Wren Turkal <wt@penguintechs.org> > >> --- > >> drivers/bluetooth/hci_qca.c | 5 ++--- > >> 1 file changed, 2 insertions(+), 3 deletions(-) > >> > >> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c > >> index 0c9c9ee56592..8e35c9091486 100644 > >> --- a/drivers/bluetooth/hci_qca.c > >> +++ b/drivers/bluetooth/hci_qca.c > >> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) > >> struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); > >> struct hci_uart *hu = &qcadev->serdev_hu; > >> struct hci_dev *hdev = hu->hdev; > >> - struct qca_data *qca = hu->priv; > >> const u8 ibs_wake_cmd[] = { 0xFD }; > >> const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; > >> > >> if (qcadev->btsoc_type == QCA_QCA6390) { > >> - if (test_bit(QCA_BT_OFF, &qca->flags) || > >> - !test_bit(HCI_RUNNING, &hdev->flags)) > > > > This probably deserves a comment on why you end up with > > HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you > > are removing the flags above since that was introduce to prevent > > use-after-free this sort of revert it so I do wonder how serdev can > > still be open if you haven't tested for QCA_BT_OFF for example? > > > okay, let me give comments at next version. > this design logic is shown below. you maybe review it. > > if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup() > is able to be invoked by every open() to initializate SoC without any > help. so we don't need to send the VSC to reset SoC into initial and > clean state for the next hdev->setup() call success. > > otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set. > > if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the > SOC is already in the initial and clean state, so we also don't need to > send the VSC to reset SOC. > > otherwise, we need to send the VSC to reset Soc into a initial and clean > state for hdev->setup() call success after "warm reboot -> enable BT" > > for the case commit message cares about, the only factor which decide to > send the VSC is that SoC is a initial and clean state or not after warm > reboot, any other factors are irrelevant to this decision. > > why the serdev is still open after go through > (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) > || hci_dev_test_flag(hdev, HCI_SETUP) checking is that > serdev is not closed by hci_uart_close(). Sounds like a logical jump to me, in fact hci_uart_close doesn't really change any of those flags, beside these flags are not really meant to tell the driver if serdev_device_close has been called or not which seems to be the intention with HCI_UART_PROTO_READY so how about we use that instead? Another thing that is troubling me is that having traffic on shutdown is not common, specially if you are going to reboot, etc, and even if it doesn't get power cycle why don't you reset on probe rather than shutdown? That way we don't have to depend on what has been done in a previous boot, which can really become a problem in case of multi-OS where you have another system that may not be doing what you expect. > see hci_uart_close() within drivers/bluetooth/hci_serdev.c > static int hci_uart_close(struct hci_dev *hdev) > { > ...... > /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver, > * BT SOC is completely powered OFF during BT OFF, holding port > * open may drain the battery. > */ > if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) { > clear_bit(HCI_UART_PROTO_READY, &hu->flags); > serdev_device_close(hu->serdev); > } > > return 0; > } > > >> + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || > >> + hci_dev_test_flag(hdev, HCI_SETUP)) > >> return; > >> > >> serdev_device_write_flush(serdev); > >> -- > >> 2.7.4 > >> > > > > >
On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote: > Hi, > > On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote: >> >> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote: >>> Hi Zijun, >>> >>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote: >>>> >>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed >>>> serdev") will cause below regression issue: >>>> >>>> BT can't be enabled after below steps: >>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure >>>> if property enable-gpios is not configured within DT|ACPI for QCA6390. >>>> >>>> The commit is to fix a use-after-free issue within qca_serdev_shutdown() >>>> during reboot, but also introduces this regression issue regarding above >>>> steps since the VSC is not sent to reset controller during warm reboot. >>>> >>>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() >>>> once BT was ever enabled, and the use-after-free issue is also be fixed >>>> by this change since serdev is still opened when send to serdev. >>>> >>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") >>>> Cc: stable@vger.kernel.org >>>> Reported-by: Wren Turkal <wt@penguintechs.org> >>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >>>> Tested-by: Wren Turkal <wt@penguintechs.org> >>>> --- >>>> drivers/bluetooth/hci_qca.c | 5 ++--- >>>> 1 file changed, 2 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c >>>> index 0c9c9ee56592..8e35c9091486 100644 >>>> --- a/drivers/bluetooth/hci_qca.c >>>> +++ b/drivers/bluetooth/hci_qca.c >>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) >>>> struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); >>>> struct hci_uart *hu = &qcadev->serdev_hu; >>>> struct hci_dev *hdev = hu->hdev; >>>> - struct qca_data *qca = hu->priv; >>>> const u8 ibs_wake_cmd[] = { 0xFD }; >>>> const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; >>>> >>>> if (qcadev->btsoc_type == QCA_QCA6390) { >>>> - if (test_bit(QCA_BT_OFF, &qca->flags) || >>>> - !test_bit(HCI_RUNNING, &hdev->flags)) >>> >>> This probably deserves a comment on why you end up with >>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you >>> are removing the flags above since that was introduce to prevent >>> use-after-free this sort of revert it so I do wonder how serdev can >>> still be open if you haven't tested for QCA_BT_OFF for example? >>> >> okay, let me give comments at next version. >> this design logic is shown below. you maybe review it. >> >> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup() >> is able to be invoked by every open() to initializate SoC without any >> help. so we don't need to send the VSC to reset SoC into initial and >> clean state for the next hdev->setup() call success. >> >> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set. >> >> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the >> SOC is already in the initial and clean state, so we also don't need to >> send the VSC to reset SOC. >> >> otherwise, we need to send the VSC to reset Soc into a initial and clean >> state for hdev->setup() call success after "warm reboot -> enable BT" >> >> for the case commit message cares about, the only factor which decide to >> send the VSC is that SoC is a initial and clean state or not after warm >> reboot, any other factors are irrelevant to this decision. >> >> why the serdev is still open after go through >> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) >> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that >> serdev is not closed by hci_uart_close(). > > Sounds like a logical jump to me, in fact hci_uart_close doesn't > really change any of those flags, beside these flags are not really > meant to tell the driver if serdev_device_close has been called or not > which seems to be the intention with HCI_UART_PROTO_READY so how about > we use that instead? > sorry for that i maybe not give good explanation, let me explain again. hci_uart_close() is the only point which maybe close serdev before qca_serdev_shutdown() is called, but for our case that HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not close serdev for our case, so serdev must be open state before sending the VSC. so should not need other checking. > Another thing that is troubling me is that having traffic on shutdown > is not common, specially if you are going to reboot, etc, and even if > it doesn't get power cycle why don't you reset on probe rather than > shutdown? That way we don't have to depend on what has been done in a > previous boot, which can really become a problem in case of multi-OS > where you have another system that may not be doing what you expect. as you know, BT UART are working at 3M baudrate for normal usage. we can't distinguish if SoC expects 3M or default 11.52K baudarate during probe() after reboot. so we send the VSC within shutdown to make sure SoC enter a initial state with 11.52 baudrate. for cold boot, SOC expects default 11.52K baudrate for probe(). for Enable BT -> warm boot, SOC expects 3M baudrate for probe(). we can't tell these two case within probe(). so need to send the VSC within shutdown(). >> see hci_uart_close() within drivers/bluetooth/hci_serdev.c >> static int hci_uart_close(struct hci_dev *hdev) >> { >> ...... >> /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver, >> * BT SOC is completely powered OFF during BT OFF, holding port >> * open may drain the battery. >> */ >> if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) { >> clear_bit(HCI_UART_PROTO_READY, &hu->flags); >> serdev_device_close(hu->serdev); >> } >> >> return 0; >> } >> >>>> + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || >>>> + hci_dev_test_flag(hdev, HCI_SETUP)) >>>> return; >>>> >>>> serdev_device_write_flush(serdev); >>>> -- >>>> 2.7.4 >>>> >>> >>> >> > >
On 2024/5/4 05:51, quic_zijuhu wrote: > On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote: >> Hi, >> >> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote: >>> >>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote: >>>> Hi Zijun, >>>> >>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote: >>>>> >>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed >>>>> serdev") will cause below regression issue: >>>>> >>>>> BT can't be enabled after below steps: >>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure >>>>> if property enable-gpios is not configured within DT|ACPI for QCA6390. >>>>> >>>>> The commit is to fix a use-after-free issue within qca_serdev_shutdown() >>>>> during reboot, but also introduces this regression issue regarding above >>>>> steps since the VSC is not sent to reset controller during warm reboot. >>>>> >>>>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() >>>>> once BT was ever enabled, and the use-after-free issue is also be fixed >>>>> by this change since serdev is still opened when send to serdev. >>>>> >>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") >>>>> Cc: stable@vger.kernel.org >>>>> Reported-by: Wren Turkal <wt@penguintechs.org> >>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >>>>> Tested-by: Wren Turkal <wt@penguintechs.org> >>>>> --- >>>>> drivers/bluetooth/hci_qca.c | 5 ++--- >>>>> 1 file changed, 2 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c >>>>> index 0c9c9ee56592..8e35c9091486 100644 >>>>> --- a/drivers/bluetooth/hci_qca.c >>>>> +++ b/drivers/bluetooth/hci_qca.c >>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) >>>>> struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); >>>>> struct hci_uart *hu = &qcadev->serdev_hu; >>>>> struct hci_dev *hdev = hu->hdev; >>>>> - struct qca_data *qca = hu->priv; >>>>> const u8 ibs_wake_cmd[] = { 0xFD }; >>>>> const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; >>>>> >>>>> if (qcadev->btsoc_type == QCA_QCA6390) { >>>>> - if (test_bit(QCA_BT_OFF, &qca->flags) || >>>>> - !test_bit(HCI_RUNNING, &hdev->flags)) >>>> >>>> This probably deserves a comment on why you end up with >>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you >>>> are removing the flags above since that was introduce to prevent >>>> use-after-free this sort of revert it so I do wonder how serdev can >>>> still be open if you haven't tested for QCA_BT_OFF for example? >>>> >>> okay, let me give comments at next version. >>> this design logic is shown below. you maybe review it. >>> >>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup() >>> is able to be invoked by every open() to initializate SoC without any >>> help. so we don't need to send the VSC to reset SoC into initial and >>> clean state for the next hdev->setup() call success. >>> >>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set. >>> >>> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the >>> SOC is already in the initial and clean state, so we also don't need to >>> send the VSC to reset SOC. >>> >>> otherwise, we need to send the VSC to reset Soc into a initial and clean >>> state for hdev->setup() call success after "warm reboot -> enable BT" >>> >>> for the case commit message cares about, the only factor which decide to >>> send the VSC is that SoC is a initial and clean state or not after warm >>> reboot, any other factors are irrelevant to this decision. >>> >>> why the serdev is still open after go through >>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) >>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that >>> serdev is not closed by hci_uart_close(). >> >> Sounds like a logical jump to me, in fact hci_uart_close doesn't >> really change any of those flags, beside these flags are not really >> meant to tell the driver if serdev_device_close has been called or not >> which seems to be the intention with HCI_UART_PROTO_READY so how about >> we use that instead? >> > sorry for that i maybe not give good explanation, let me explain again. > hci_uart_close() is the only point which maybe close serdev before > qca_serdev_shutdown() is called, but for our case that > HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not > close serdev for our case, so serdev must be open state before sending > the VSC. so should not need other checking. > hello, i have paid attention to your discussion for a long time, i would like to join this discussion now. The serdev is still open before sending the VSC for this patch. are you agree with above Zijun's point? >> Another thing that is troubling me is that having traffic on shutdown >> is not common, specially if you are going to reboot, etc, and even if >> it doesn't get power cycle why don't you reset on probe rather than >> shutdown? That way we don't have to depend on what has been done in a >> previous boot, which can really become a problem in case of multi-OS >> where you have another system that may not be doing what you expect. > as you know, BT UART are working at 3M baudrate for normal usage. > we can't distinguish if SoC expects 3M or default 11.52K baudarate > during probe() after reboot. so we send the VSC within shutdown to make > sure SoC enter a initial state with 11.52 baudrate. > > for cold boot, SOC expects default 11.52K baudrate for probe(). > for Enable BT -> warm boot, SOC expects 3M baudrate for probe(). > we can't tell these two case within probe(). so need to send the VSC > within shutdown(). > it seems the traffic within qca_serdev_shutdown() actually does software reset for BT SOC. From Zijun's points. the reasons why to do software reset within shutdown() instead of probe() maybe be shown below 1) it is impossible to do software reset within probe(). 2) it seems it is easier to do it within shutdown() than probe. Zijun's simple fix only change the condition and does NOT change the location to send the VSC, i think it maybe be other topic about location where(probe() or shutdown()) to do software reset. are you agree with this point? For concern about multi-OS, i would like to show my points. the patch is for linux kernel, we maybe only need to care about linux OS. it maybe out-of-scope to make assumptions about other OSs vendor announced supported such as Windows OS we don't known much about. are you agree with this point? >>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c >>> static int hci_uart_close(struct hci_dev *hdev) >>> { >>> ...... >>> /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver, >>> * BT SOC is completely powered OFF during BT OFF, holding port >>> * open may drain the battery. >>> */ >>> if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) { >>> clear_bit(HCI_UART_PROTO_READY, &hu->flags); >>> serdev_device_close(hu->serdev); >>> } >>> >>> return 0; >>> } >>> >>>>> + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || >>>>> + hci_dev_test_flag(hdev, HCI_SETUP)) >>>>> return; >>>>> >>>>> serdev_device_write_flush(serdev); >>>>> -- >>>>> 2.7.4 >>>>> >>>> >>>> >>> >> >> > > >
On 5/3/24 3:16 AM, Krzysztof Kozlowski wrote: > On 02/05/2024 16:06, Zijun Hu wrote: >> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed >> serdev") will cause below regression issue: >> >> BT can't be enabled after below steps: >> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure >> if property enable-gpios is not configured within DT|ACPI for QCA6390. >> >> The commit is to fix a use-after-free issue within qca_serdev_shutdown() >> during reboot, but also introduces this regression issue regarding above >> steps since the VSC is not sent to reset controller during warm reboot. >> >> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() >> once BT was ever enabled, and the use-after-free issue is also be fixed >> by this change since serdev is still opened when send to serdev. >> >> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") >> Cc: stable@vger.kernel.org >> Reported-by: Wren Turkal <wt@penguintechs.org> >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >> Tested-by: Wren Turkal <wt@penguintechs.org> >> --- >> drivers/bluetooth/hci_qca.c | 5 ++--- > > I don't think this is v1. Version your patches properly and provide > changelog. > > I asked already *two times*: > 1. On which kernel did you test it? > 2. On which hardware did you test it? I thought I had already chimed in with this information. I am using a Dell XPS 13 9310. It's the only hardware I have access to. I can say that the fix seems to work as advertised in that it fixes the warm boot issue I have been experiencing. As a user, I vastly prefer a one time cold boot rather than having to do it every time. As such, I can do the test. However, I do not think that the patch not working in that case should block landing logic that empirically improved my experience with the hardware and minimizes the corner case to a much smaller corner (that of warm booting from an OS that improperly puts the device in a bad state to a Linux kernel with this UX improvement). wt
On 5/7/24 6:48 AM, Lk Sii wrote: > On 2024/5/4 05:51, quic_zijuhu wrote: >> On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote: >>> Hi, >>> >>> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote: >>>> >>>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote: >>>>> Hi Zijun, >>>>> >>>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote: >>>>>> >>>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed >>>>>> serdev") will cause below regression issue: >>>>>> >>>>>> BT can't be enabled after below steps: >>>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure >>>>>> if property enable-gpios is not configured within DT|ACPI for QCA6390. >>>>>> >>>>>> The commit is to fix a use-after-free issue within qca_serdev_shutdown() >>>>>> during reboot, but also introduces this regression issue regarding above >>>>>> steps since the VSC is not sent to reset controller during warm reboot. >>>>>> >>>>>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown() >>>>>> once BT was ever enabled, and the use-after-free issue is also be fixed >>>>>> by this change since serdev is still opened when send to serdev. >>>>>> >>>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev") >>>>>> Cc: stable@vger.kernel.org >>>>>> Reported-by: Wren Turkal <wt@penguintechs.org> >>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >>>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >>>>>> Tested-by: Wren Turkal <wt@penguintechs.org> >>>>>> --- >>>>>> drivers/bluetooth/hci_qca.c | 5 ++--- >>>>>> 1 file changed, 2 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c >>>>>> index 0c9c9ee56592..8e35c9091486 100644 >>>>>> --- a/drivers/bluetooth/hci_qca.c >>>>>> +++ b/drivers/bluetooth/hci_qca.c >>>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) >>>>>> struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); >>>>>> struct hci_uart *hu = &qcadev->serdev_hu; >>>>>> struct hci_dev *hdev = hu->hdev; >>>>>> - struct qca_data *qca = hu->priv; >>>>>> const u8 ibs_wake_cmd[] = { 0xFD }; >>>>>> const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; >>>>>> >>>>>> if (qcadev->btsoc_type == QCA_QCA6390) { >>>>>> - if (test_bit(QCA_BT_OFF, &qca->flags) || >>>>>> - !test_bit(HCI_RUNNING, &hdev->flags)) >>>>> >>>>> This probably deserves a comment on why you end up with >>>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you >>>>> are removing the flags above since that was introduce to prevent >>>>> use-after-free this sort of revert it so I do wonder how serdev can >>>>> still be open if you haven't tested for QCA_BT_OFF for example? >>>>> >>>> okay, let me give comments at next version. >>>> this design logic is shown below. you maybe review it. >>>> >>>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup() >>>> is able to be invoked by every open() to initializate SoC without any >>>> help. so we don't need to send the VSC to reset SoC into initial and >>>> clean state for the next hdev->setup() call success. >>>> >>>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set. >>>> >>>> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the >>>> SOC is already in the initial and clean state, so we also don't need to >>>> send the VSC to reset SOC. >>>> >>>> otherwise, we need to send the VSC to reset Soc into a initial and clean >>>> state for hdev->setup() call success after "warm reboot -> enable BT" >>>> >>>> for the case commit message cares about, the only factor which decide to >>>> send the VSC is that SoC is a initial and clean state or not after warm >>>> reboot, any other factors are irrelevant to this decision. >>>> >>>> why the serdev is still open after go through >>>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) >>>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that >>>> serdev is not closed by hci_uart_close(). >>> >>> Sounds like a logical jump to me, in fact hci_uart_close doesn't >>> really change any of those flags, beside these flags are not really >>> meant to tell the driver if serdev_device_close has been called or not >>> which seems to be the intention with HCI_UART_PROTO_READY so how about >>> we use that instead? >>> >> sorry for that i maybe not give good explanation, let me explain again. >> hci_uart_close() is the only point which maybe close serdev before >> qca_serdev_shutdown() is called, but for our case that >> HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not >> close serdev for our case, so serdev must be open state before sending >> the VSC. so should not need other checking. >> > hello, i have paid attention to your discussion for a long time, i would > like to join this discussion now. > > The serdev is still open before sending the VSC for this patch. > > are you agree with above Zijun's point? I will say that this part of the discussion seems to be addressing KK's concerns. @KK, is this accurate? @Zijun, this description along with the info about the baud rate issues should probably be part of the commit message. In these last two messages, you have been much clearer about why this logic is needed and correct. I wish you'd provided this description from the beginning when KK asked for more information about the logic change. >>> Another thing that is troubling me is that having traffic on shutdown >>> is not common, specially if you are going to reboot, etc, and even if >>> it doesn't get power cycle why don't you reset on probe rather than >>> shutdown? That way we don't have to depend on what has been done in a >>> previous boot, which can really become a problem in case of multi-OS >>> where you have another system that may not be doing what you expect. >> as you know, BT UART are working at 3M baudrate for normal usage. >> we can't distinguish if SoC expects 3M or default 11.52K baudarate >> during probe() after reboot. so we send the VSC within shutdown to make >> sure SoC enter a initial state with 11.52 baudrate. >> >> for cold boot, SOC expects default 11.52K baudrate for probe(). >> for Enable BT -> warm boot, SOC expects 3M baudrate for probe(). >> we can't tell these two case within probe(). so need to send the VSC >> within shutdown(). >> > it seems the traffic within qca_serdev_shutdown() actually does software > reset for BT SOC. > > From Zijun's points. the reasons why to do software reset within > shutdown() instead of probe() maybe be shown below > 1) it is impossible to do software reset within probe(). > 2) it seems it is easier to do it within shutdown() than probe. > > Zijun's simple fix only change the condition and does NOT change the > location to send the VSC, i think it maybe be other topic about location > where(probe() or shutdown()) to do software reset. > > are you agree with this point? From a practical standpoint, this change does seem to fix the warm boot issue on my laptop. I do not think that it would fix the issue of booting from an OS that puts the hardware into an unknown state. > For concern about multi-OS, i would like to show my points. > > the patch is for linux kernel, we maybe only need to care about linux > OS. it maybe out-of-scope to make assumptions about other OSs vendor > announced supported such as Windows OS we don't known much about. > > are you agree with this point? I would not fully agree with this point. However, I would agree completely with your proposal for moving forward (i.e. landing the change). I would say that it is a problem if the kernel is not doing something to setup the hardware correctly in any case where it is technically possible. That is clearly a bug in the driver and Qualcomm should take responsibility for fixing this poor design. Luiz is totally right here. Having said that, I don't see that bug as a blocker for a logic fix that creates an obvious (in my mind) UX improvement. Here's my view of the situation. Right now, I experience a bug on every single warm boot or module reload). After Zijun's improvement commit, I might experience this problem iif the right set of rare circumstances occurs (i.e. whenever I warm boot from an OS that puts the hardware in an unknown state, like the current mainline kernel to a kernel with the improvement). In the world where the design problem of the init/shutdown sequences are fixed AND this logic change is applied, I might never see this problem again. These seem like two different problems in my head. They don't seem directly logically related, and the blast radii of the problems are very different. Here's the facts as I see them: 1. Logic improvement problem and init/shutdown sequence problem are 2 orthogonal problems 2. Logic improvement greatly users' chances of running into the bad UX of the current code. 3. Init/shutdown isn't a trivial extension from the logic improvement. 4. Logic improvement has a pretty low cost to apply. 5. Init/shutdown sequence fix seems to be more fundamental and more intrusive. All of these together indicate to me that the logic improvement should be landed. The init/shutdown issue should also be fixed, but a fix for that issue should not block the logic improvement. My basic reasoning for this is that a visible UX fix should not e blocked for a change like the init/shutdown logic fix when the logic fix will help users. However, Qualcomm does need to feel some pressure to fix their driver code. I would like to think that user's will not be held hostage for putting pressure on a vendor in this case as the balance seems very intrusive for users. It certainly feels intrusive as a user to have to be very careful about how I reboot my laptop. >>>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c >>>> static int hci_uart_close(struct hci_dev *hdev) >>>> { >>>> ...... >>>> /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver, >>>> * BT SOC is completely powered OFF during BT OFF, holding port >>>> * open may drain the battery. >>>> */ >>>> if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) { >>>> clear_bit(HCI_UART_PROTO_READY, &hu->flags); >>>> serdev_device_close(hu->serdev); >>>> } >>>> >>>> return 0; >>>> } >>>> >>>>>> + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || >>>>>> + hci_dev_test_flag(hdev, HCI_SETUP)) >>>>>> return; >>>>>> >>>>>> serdev_device_write_flush(serdev); >>>>>> -- >>>>>> 2.7.4 >>>>>> >>>>> >>>>> >>>> >>> >>> >> >> >> >
On 2024/5/11 04:45, Wren Turkal wrote: > On 5/7/24 6:48 AM, Lk Sii wrote: >> On 2024/5/4 05:51, quic_zijuhu wrote: >>> On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote: >>>> Hi, >>>> >>>> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> >>>> wrote: >>>>> >>>>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote: >>>>>> Hi Zijun, >>>>>> >>>>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> >>>>>> wrote: >>>>>>> >>>>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on >>>>>>> closed >>>>>>> serdev") will cause below regression issue: >>>>>>> >>>>>>> BT can't be enabled after below steps: >>>>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable >>>>>>> failure >>>>>>> if property enable-gpios is not configured within DT|ACPI for >>>>>>> QCA6390. >>>>>>> >>>>>>> The commit is to fix a use-after-free issue within >>>>>>> qca_serdev_shutdown() >>>>>>> during reboot, but also introduces this regression issue >>>>>>> regarding above >>>>>>> steps since the VSC is not sent to reset controller during warm >>>>>>> reboot. >>>>>>> >>>>>>> Fixed by sending the VSC to reset controller within >>>>>>> qca_serdev_shutdown() >>>>>>> once BT was ever enabled, and the use-after-free issue is also be >>>>>>> fixed >>>>>>> by this change since serdev is still opened when send to serdev. >>>>>>> >>>>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on >>>>>>> closed serdev") >>>>>>> Cc: stable@vger.kernel.org >>>>>>> Reported-by: Wren Turkal <wt@penguintechs.org> >>>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726 >>>>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> >>>>>>> Tested-by: Wren Turkal <wt@penguintechs.org> >>>>>>> --- >>>>>>> drivers/bluetooth/hci_qca.c | 5 ++--- >>>>>>> 1 file changed, 2 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/bluetooth/hci_qca.c >>>>>>> b/drivers/bluetooth/hci_qca.c >>>>>>> index 0c9c9ee56592..8e35c9091486 100644 >>>>>>> --- a/drivers/bluetooth/hci_qca.c >>>>>>> +++ b/drivers/bluetooth/hci_qca.c >>>>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct >>>>>>> device *dev) >>>>>>> struct qca_serdev *qcadev = >>>>>>> serdev_device_get_drvdata(serdev); >>>>>>> struct hci_uart *hu = &qcadev->serdev_hu; >>>>>>> struct hci_dev *hdev = hu->hdev; >>>>>>> - struct qca_data *qca = hu->priv; >>>>>>> const u8 ibs_wake_cmd[] = { 0xFD }; >>>>>>> const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, >>>>>>> 0x05 }; >>>>>>> >>>>>>> if (qcadev->btsoc_type == QCA_QCA6390) { >>>>>>> - if (test_bit(QCA_BT_OFF, &qca->flags) || >>>>>>> - !test_bit(HCI_RUNNING, &hdev->flags)) >>>>>> >>>>>> This probably deserves a comment on why you end up with >>>>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you >>>>>> are removing the flags above since that was introduce to prevent >>>>>> use-after-free this sort of revert it so I do wonder how serdev can >>>>>> still be open if you haven't tested for QCA_BT_OFF for example? >>>>>> >>>>> okay, let me give comments at next version. >>>>> this design logic is shown below. you maybe review it. >>>>> >>>>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup() >>>>> is able to be invoked by every open() to initializate SoC without any >>>>> help. so we don't need to send the VSC to reset SoC into initial and >>>>> clean state for the next hdev->setup() call success. >>>>> >>>>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set. >>>>> >>>>> if HCI_SETUP is set, it means hdev->setup() was never be invoked, >>>>> so the >>>>> SOC is already in the initial and clean state, so we also don't >>>>> need to >>>>> send the VSC to reset SOC. >>>>> >>>>> otherwise, we need to send the VSC to reset Soc into a initial and >>>>> clean >>>>> state for hdev->setup() call success after "warm reboot -> enable BT" >>>>> >>>>> for the case commit message cares about, the only factor which >>>>> decide to >>>>> send the VSC is that SoC is a initial and clean state or not after >>>>> warm >>>>> reboot, any other factors are irrelevant to this decision. >>>>> >>>>> why the serdev is still open after go through >>>>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) >>>>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that >>>>> serdev is not closed by hci_uart_close(). >>>> >>>> Sounds like a logical jump to me, in fact hci_uart_close doesn't >>>> really change any of those flags, beside these flags are not really >>>> meant to tell the driver if serdev_device_close has been called or not >>>> which seems to be the intention with HCI_UART_PROTO_READY so how about >>>> we use that instead? >>>> >>> sorry for that i maybe not give good explanation, let me explain again. >>> hci_uart_close() is the only point which maybe close serdev before >>> qca_serdev_shutdown() is called, but for our case that >>> HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not >>> close serdev for our case, so serdev must be open state before sending >>> the VSC. so should not need other checking. >>> >> hello, i have paid attention to your discussion for a long time, i would >> like to join this discussion now. >> >> The serdev is still open before sending the VSC for this patch. >> >> are you agree with above Zijun's point? > > I will say that this part of the discussion seems to be addressing KK's > concerns. > > @KK, is this accurate? > > @Zijun, this description along with the info about the baud rate issues > should probably be part of the commit message. In these last two > messages, you have been much clearer about why this logic is needed and > correct. I wish you'd provided this description from the beginning when > KK asked for more information about the logic change. > >>>> Another thing that is troubling me is that having traffic on shutdown >>>> is not common, specially if you are going to reboot, etc, and even if >>>> it doesn't get power cycle why don't you reset on probe rather than >>>> shutdown? That way we don't have to depend on what has been done in a >>>> previous boot, which can really become a problem in case of multi-OS >>>> where you have another system that may not be doing what you expect. >>> as you know, BT UART are working at 3M baudrate for normal usage. >>> we can't distinguish if SoC expects 3M or default 11.52K baudarate >>> during probe() after reboot. so we send the VSC within shutdown to make >>> sure SoC enter a initial state with 11.52 baudrate. >>> >>> for cold boot, SOC expects default 11.52K baudrate for probe(). >>> for Enable BT -> warm boot, SOC expects 3M baudrate for probe(). >>> we can't tell these two case within probe(). so need to send the VSC >>> within shutdown(). >>> >> it seems the traffic within qca_serdev_shutdown() actually does software >> reset for BT SOC. >> >> From Zijun's points. the reasons why to do software reset within >> shutdown() instead of probe() maybe be shown below >> 1) it is impossible to do software reset within probe(). >> 2) it seems it is easier to do it within shutdown() than probe. >> >> Zijun's simple fix only change the condition and does NOT change the >> location to send the VSC, i think it maybe be other topic about location >> where(probe() or shutdown()) to do software reset. >> >> are you agree with this point? > > From a practical standpoint, this change does seem to fix the warm boot > issue on my laptop. I do not think that it would fix the issue of > booting from an OS that puts the hardware into an unknown state. > we may not need to focus on below multi-OS issue Luiz assume Other OS such as Windows -> warm boot -> boot linux OS -> enable BT failure. firstly, we don't know and can't confirm that other OS don't the similar jobs as linux kernel do, so can't confirm if the assumed issue is a real and valid issue. secondary, if other OS is supported as announced by vendor Qualcomm, their BT driver for other OS maybe have contained solution for Multi-OS relevant concerns. >> For concern about multi-OS, i would like to show my points. >> >> the patch is for linux kernel, we maybe only need to care about linux >> OS. it maybe out-of-scope to make assumptions about other OSs vendor >> announced supported such as Windows OS we don't known much about. >> >> are you agree with this point? > > I would not fully agree with this point. However, I would agree > completely with your proposal for moving forward (i.e. landing the change). > i agree with above Wren's point about moving forward. > I would say that it is a problem if the kernel is not doing something to > setup the hardware correctly in any case where it is technically > possible. That is clearly a bug in the driver and Qualcomm should take > responsibility for fixing this poor design. Luiz is totally right here. >i don't agree with your above point that it is a bug and driver's poor design, i don't actually understand how|who to conclude that present design is clearly a bug and is poor design. it seems this design(software reset within shutdown()) is dedicated for the scenario that HOST doesn't have H/W resource to reset SOC if you noticed below link https://patchwork.kernel.org/project/bluetooth/patch/1713947712-4307-1-git-send-email-quic_zijuhu@quicinc.com/ For the scenario, perhaps, vendor Qualcomm have had more considerations than you can image and finally selected this doable design. as i commented previously. this patch doesn't touch software reset location designed. so it maybe be out-of-scope to discuss why probe() does not do the job. > Having said that, I don't see that bug as a blocker for a logic fix that > creates an obvious (in my mind) UX improvement. > > Here's my view of the situation. Right now, I experience a bug on every > single warm boot or module reload). > > After Zijun's improvement commit, I might experience this problem iif > the right set of rare circumstances occurs (i.e. whenever I warm boot > from an OS that puts the hardware in an unknown state, like the current > mainline kernel to a kernel with the improvement). > > In the world where the design problem of the init/shutdown sequences are > fixed AND this logic change is applied, I might never see this problem > again. > > These seem like two different problems in my head. They don't seem > directly logically related, and the blast radii of the problems are very > different. > > Here's the facts as I see them: > 1. Logic improvement problem and init/shutdown sequence problem are 2 > orthogonal problems > 2. Logic improvement greatly users' chances of running into the bad UX > of the current code. > 3. Init/shutdown isn't a trivial extension from the logic improvement. > 4. Logic improvement has a pretty low cost to apply. > 5. Init/shutdown sequence fix seems to be more fundamental and more > intrusive. > > All of these together indicate to me that the logic improvement should > be landed. The init/shutdown issue should also be fixed, but a fix for > that issue should not block the logic improvement. > > My basic reasoning for this is that a visible UX fix should not e > blocked for a change like the init/shutdown logic fix when the logic fix > will help users. > > However, Qualcomm does need to feel some pressure to fix their driver > code. I would like to think that user's will not be held hostage for > putting pressure on a vendor in this case as the balance seems very > intrusive for users. It certainly feels intrusive as a user to have to > be very careful about how I reboot my laptop. > >>>>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c >>>>> static int hci_uart_close(struct hci_dev *hdev) >>>>> { >>>>> ...... >>>>> /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by >>>>> driver, >>>>> * BT SOC is completely powered OFF during BT OFF, holding >>>>> port >>>>> * open may drain the battery. >>>>> */ >>>>> if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, >>>>> &hdev->quirks)) { >>>>> clear_bit(HCI_UART_PROTO_READY, &hu->flags); >>>>> serdev_device_close(hu->serdev); >>>>> } >>>>> >>>>> return 0; >>>>> } >>>>> >>>>>>> + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, >>>>>>> &hdev->quirks) || >>>>>>> + hci_dev_test_flag(hdev, HCI_SETUP)) >>>>>>> return; >>>>>>> >>>>>>> serdev_device_write_flush(serdev); >>>>>>> -- >>>>>>> 2.7.4 >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>> >>> >>> >> >
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c index 0c9c9ee56592..8e35c9091486 100644 --- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev) struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); struct hci_uart *hu = &qcadev->serdev_hu; struct hci_dev *hdev = hu->hdev; - struct qca_data *qca = hu->priv; const u8 ibs_wake_cmd[] = { 0xFD }; const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; if (qcadev->btsoc_type == QCA_QCA6390) { - if (test_bit(QCA_BT_OFF, &qca->flags) || - !test_bit(HCI_RUNNING, &hdev->flags)) + if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) || + hci_dev_test_flag(hdev, HCI_SETUP)) return; serdev_device_write_flush(serdev);