diff mbox series

[v1] Bluetooth: qca: Fix BT enable failure again for QCA6390 after warm reboot

Message ID 1714658761-15326-1-git-send-email-quic_zijuhu@quicinc.com (mailing list archive)
State Superseded
Headers show
Series [v1] Bluetooth: qca: Fix BT enable failure again for QCA6390 after warm reboot | expand

Checks

Context Check Description
tedd_an/pre-ci_am success Success
tedd_an/CheckPatch success CheckPatch PASS
tedd_an/GitLint success Gitlint PASS
tedd_an/SubjectPrefix success Gitlint PASS
tedd_an/BuildKernel success BuildKernel PASS
tedd_an/CheckAllWarning success CheckAllWarning PASS
tedd_an/CheckSparse success CheckSparse PASS
tedd_an/CheckSmatch fail CheckSparse: FAIL: Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: net/bluetooth/hci_core.o] Error 139 make[4]: *** Deleting file 'net/bluetooth/hci_core.o' make[3]: *** [scripts/Makefile.build:485: net/bluetooth] Error 2 make[2]: *** [scripts/Makefile.build:485: net] Error 2 make[2]: *** Waiting for unfinished jobs.... Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bcm203x.o] Error 139 make[4]: *** Deleting file 'drivers/bluetooth/bcm203x.o' make[4]: *** Waiting for unfinished jobs.... Segmentation fault (core dumped) make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bpa10x.o] Error 139 make[4]: *** Deleting file 'drivers/bluetooth/bpa10x.o' make[3]: *** [scripts/Makefile.build:485: drivers/bluetooth] Error 2 make[2]: *** [scripts/Makefile.build:485: drivers] Error 2 make[1]: *** [/github/workspace/src/src/Makefile:1919: .] Error 2 make: *** [Makefile:240: __sub-make] Error 2
tedd_an/BuildKernel32 success BuildKernel32 PASS
tedd_an/TestRunnerSetup success TestRunnerSetup PASS
tedd_an/TestRunner_l2cap-tester success TestRunner PASS
tedd_an/TestRunner_iso-tester success TestRunner PASS
tedd_an/TestRunner_bnep-tester success TestRunner PASS
tedd_an/TestRunner_mgmt-tester success TestRunner PASS
tedd_an/TestRunner_rfcomm-tester success TestRunner PASS
tedd_an/TestRunner_sco-tester success TestRunner PASS
tedd_an/TestRunner_ioctl-tester success TestRunner PASS
tedd_an/TestRunner_mesh-tester success TestRunner PASS
tedd_an/TestRunner_smp-tester success TestRunner PASS
tedd_an/TestRunner_userchan-tester success TestRunner PASS
tedd_an/IncrementalBuild success Incremental Build PASS

Commit Message

quic_zijuhu May 2, 2024, 2:06 p.m. UTC
Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
serdev") will cause below regression issue:

BT can't be enabled after below steps:
cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
if property enable-gpios is not configured within DT|ACPI for QCA6390.

The commit is to fix a use-after-free issue within qca_serdev_shutdown()
during reboot, but also introduces this regression issue regarding above
steps since the VSC is not sent to reset controller during warm reboot.

Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
once BT was ever enabled, and the use-after-free issue is also be fixed
by this change since serdev is still opened when send to serdev.

Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
Cc: stable@vger.kernel.org
Reported-by: Wren Turkal <wt@penguintechs.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Tested-by: Wren Turkal <wt@penguintechs.org>
---
 drivers/bluetooth/hci_qca.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

bluez.test.bot@gmail.com May 2, 2024, 2:32 p.m. UTC | #1
This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=849912

---Test result---

Test Summary:
CheckPatch                    PASS      0.49 seconds
GitLint                       PASS      0.19 seconds
SubjectPrefix                 PASS      0.06 seconds
BuildKernel                   PASS      30.95 seconds
CheckAllWarning               PASS      34.14 seconds
CheckSparse                   PASS      43.14 seconds
CheckSmatch                   FAIL      37.86 seconds
BuildKernel32                 PASS      29.85 seconds
TestRunnerSetup               PASS      546.06 seconds
TestRunner_l2cap-tester       PASS      18.60 seconds
TestRunner_iso-tester         PASS      31.83 seconds
TestRunner_bnep-tester        PASS      4.77 seconds
TestRunner_mgmt-tester        PASS      113.61 seconds
TestRunner_rfcomm-tester      PASS      7.43 seconds
TestRunner_sco-tester         PASS      15.15 seconds
TestRunner_ioctl-tester       PASS      7.94 seconds
TestRunner_mesh-tester        PASS      5.89 seconds
TestRunner_smp-tester         PASS      6.74 seconds
TestRunner_userchan-tester    PASS      4.87 seconds
IncrementalBuild              PASS      28.61 seconds

Details
##############################
Test: CheckSmatch - FAIL
Desc: Run smatch tool with source
Output:

Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: net/bluetooth/hci_core.o] Error 139
make[4]: *** Deleting file 'net/bluetooth/hci_core.o'
make[3]: *** [scripts/Makefile.build:485: net/bluetooth] Error 2
make[2]: *** [scripts/Makefile.build:485: net] Error 2
make[2]: *** Waiting for unfinished jobs....
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bcm203x.o] Error 139
make[4]: *** Deleting file 'drivers/bluetooth/bcm203x.o'
make[4]: *** Waiting for unfinished jobs....
Segmentation fault (core dumped)
make[4]: *** [scripts/Makefile.build:244: drivers/bluetooth/bpa10x.o] Error 139
make[4]: *** Deleting file 'drivers/bluetooth/bpa10x.o'
make[3]: *** [scripts/Makefile.build:485: drivers/bluetooth] Error 2
make[2]: *** [scripts/Makefile.build:485: drivers] Error 2
make[1]: *** [/github/workspace/src/src/Makefile:1919: .] Error 2
make: *** [Makefile:240: __sub-make] Error 2


---
Regards,
Linux Bluetooth
Krzysztof Kozlowski May 3, 2024, 10:16 a.m. UTC | #2
On 02/05/2024 16:06, Zijun Hu wrote:
> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
> serdev") will cause below regression issue:
> 
> BT can't be enabled after below steps:
> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
> if property enable-gpios is not configured within DT|ACPI for QCA6390.
> 
> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
> during reboot, but also introduces this regression issue regarding above
> steps since the VSC is not sent to reset controller during warm reboot.
> 
> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
> once BT was ever enabled, and the use-after-free issue is also be fixed
> by this change since serdev is still opened when send to serdev.
> 
> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
> Cc: stable@vger.kernel.org
> Reported-by: Wren Turkal <wt@penguintechs.org>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
> Tested-by: Wren Turkal <wt@penguintechs.org>
> ---
>  drivers/bluetooth/hci_qca.c | 5 ++---

I don't think this is v1. Version your patches properly and provide
changelog.

I asked already *two times*:
1. On which kernel did you test it?
2. On which hardware did you test it?

I am not interested in any replies like "I wrote something on bugzilla".
I am really fed up with your elusive, time-wasting replies, so be
specific here.

Best regards,
Krzysztof
quic_zijuhu May 3, 2024, 6:49 p.m. UTC | #3
On 5/3/2024 6:16 PM, Krzysztof Kozlowski wrote:
> On 02/05/2024 16:06, Zijun Hu wrote:
>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
>> serdev") will cause below regression issue:
>>
>> BT can't be enabled after below steps:
>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>>
>> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
>> during reboot, but also introduces this regression issue regarding above
>> steps since the VSC is not sent to reset controller during warm reboot.
>>
>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
>> once BT was ever enabled, and the use-after-free issue is also be fixed
>> by this change since serdev is still opened when send to serdev.
>>
>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
>> Cc: stable@vger.kernel.org
>> Reported-by: Wren Turkal <wt@penguintechs.org>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>> Tested-by: Wren Turkal <wt@penguintechs.org>
>> ---
>>  drivers/bluetooth/hci_qca.c | 5 ++---
> 
> I don't think this is v1. Version your patches properly and provide
> changelog.
> 
i sent it as v1 to start a new and clean discussion.
> I asked already *two times*:
> 1. On which kernel did you test it?
> 2. On which hardware did you test it?
> 
will provide such info within next commit message.
> I am not interested in any replies like "I wrote something on bugzilla".
> I am really fed up with your elusive, time-wasting replies, so be
> specific here.
> 
are there any other concerns about this patch itself?
> Best regards,
> Krzysztof
>
Luiz Augusto von Dentz May 3, 2024, 7:22 p.m. UTC | #4
Hi Zijun,

On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote:
>
> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
> serdev") will cause below regression issue:
>
> BT can't be enabled after below steps:
> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>
> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
> during reboot, but also introduces this regression issue regarding above
> steps since the VSC is not sent to reset controller during warm reboot.
>
> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
> once BT was ever enabled, and the use-after-free issue is also be fixed
> by this change since serdev is still opened when send to serdev.
>
> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
> Cc: stable@vger.kernel.org
> Reported-by: Wren Turkal <wt@penguintechs.org>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
> Tested-by: Wren Turkal <wt@penguintechs.org>
> ---
>  drivers/bluetooth/hci_qca.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> index 0c9c9ee56592..8e35c9091486 100644
> --- a/drivers/bluetooth/hci_qca.c
> +++ b/drivers/bluetooth/hci_qca.c
> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev)
>         struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
>         struct hci_uart *hu = &qcadev->serdev_hu;
>         struct hci_dev *hdev = hu->hdev;
> -       struct qca_data *qca = hu->priv;
>         const u8 ibs_wake_cmd[] = { 0xFD };
>         const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
>
>         if (qcadev->btsoc_type == QCA_QCA6390) {
> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
> -                   !test_bit(HCI_RUNNING, &hdev->flags))

This probably deserves a comment on why you end up with
HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
are removing the flags above since that was introduce to prevent
use-after-free this sort of revert it so I do wonder how serdev can
still be open if you haven't tested for QCA_BT_OFF for example?

> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
> +                   hci_dev_test_flag(hdev, HCI_SETUP))
>                         return;
>
>                 serdev_device_write_flush(serdev);
> --
> 2.7.4
>
quic_zijuhu May 3, 2024, 8:18 p.m. UTC | #5
On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote:
> Hi Zijun,
> 
> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote:
>>
>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
>> serdev") will cause below regression issue:
>>
>> BT can't be enabled after below steps:
>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>>
>> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
>> during reboot, but also introduces this regression issue regarding above
>> steps since the VSC is not sent to reset controller during warm reboot.
>>
>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
>> once BT was ever enabled, and the use-after-free issue is also be fixed
>> by this change since serdev is still opened when send to serdev.
>>
>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
>> Cc: stable@vger.kernel.org
>> Reported-by: Wren Turkal <wt@penguintechs.org>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>> Tested-by: Wren Turkal <wt@penguintechs.org>
>> ---
>>  drivers/bluetooth/hci_qca.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
>> index 0c9c9ee56592..8e35c9091486 100644
>> --- a/drivers/bluetooth/hci_qca.c
>> +++ b/drivers/bluetooth/hci_qca.c
>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev)
>>         struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
>>         struct hci_uart *hu = &qcadev->serdev_hu;
>>         struct hci_dev *hdev = hu->hdev;
>> -       struct qca_data *qca = hu->priv;
>>         const u8 ibs_wake_cmd[] = { 0xFD };
>>         const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
>>
>>         if (qcadev->btsoc_type == QCA_QCA6390) {
>> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
>> -                   !test_bit(HCI_RUNNING, &hdev->flags))
> 
> This probably deserves a comment on why you end up with
> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
> are removing the flags above since that was introduce to prevent
> use-after-free this sort of revert it so I do wonder how serdev can
> still be open if you haven't tested for QCA_BT_OFF for example?
> 
okay, let me give comments at next version.
this design logic is shown below. you maybe review it.

if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup()
is able to be invoked by every open() to initializate SoC without any
help. so we don't need to send the VSC to reset SoC into initial and
clean state for the next hdev->setup() call success.

otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set.

if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the
SOC is already in the initial and clean state, so we also don't need to
send the VSC to reset SOC.

otherwise, we need to send the VSC to reset Soc into a initial and clean
state for hdev->setup() call success after "warm reboot -> enable BT"

for the case commit message cares about, the only factor which decide to
send the VSC is that SoC is a initial and clean state or not after warm
reboot, any other factors are irrelevant to this decision.

why the serdev is still open after go through
(test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)
|| hci_dev_test_flag(hdev, HCI_SETUP) checking is that
serdev is not closed by hci_uart_close().

see hci_uart_close() within drivers/bluetooth/hci_serdev.c
static int hci_uart_close(struct hci_dev *hdev)
{
......
	/* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver,
	 * BT SOC is completely powered OFF during BT OFF, holding port
	 * open may drain the battery.
	 */
	if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
		clear_bit(HCI_UART_PROTO_READY, &hu->flags);
		serdev_device_close(hu->serdev);
	}

	return 0;
}

>> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
>> +                   hci_dev_test_flag(hdev, HCI_SETUP))
>>                         return;
>>
>>                 serdev_device_write_flush(serdev);
>> --
>> 2.7.4
>>
> 
>
Luiz Augusto von Dentz May 3, 2024, 9:25 p.m. UTC | #6
Hi,

On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote:
>
> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote:
> > Hi Zijun,
> >
> > On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote:
> >>
> >> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
> >> serdev") will cause below regression issue:
> >>
> >> BT can't be enabled after below steps:
> >> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
> >> if property enable-gpios is not configured within DT|ACPI for QCA6390.
> >>
> >> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
> >> during reboot, but also introduces this regression issue regarding above
> >> steps since the VSC is not sent to reset controller during warm reboot.
> >>
> >> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
> >> once BT was ever enabled, and the use-after-free issue is also be fixed
> >> by this change since serdev is still opened when send to serdev.
> >>
> >> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
> >> Cc: stable@vger.kernel.org
> >> Reported-by: Wren Turkal <wt@penguintechs.org>
> >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
> >> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
> >> Tested-by: Wren Turkal <wt@penguintechs.org>
> >> ---
> >>  drivers/bluetooth/hci_qca.c | 5 ++---
> >>  1 file changed, 2 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> >> index 0c9c9ee56592..8e35c9091486 100644
> >> --- a/drivers/bluetooth/hci_qca.c
> >> +++ b/drivers/bluetooth/hci_qca.c
> >> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev)
> >>         struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
> >>         struct hci_uart *hu = &qcadev->serdev_hu;
> >>         struct hci_dev *hdev = hu->hdev;
> >> -       struct qca_data *qca = hu->priv;
> >>         const u8 ibs_wake_cmd[] = { 0xFD };
> >>         const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
> >>
> >>         if (qcadev->btsoc_type == QCA_QCA6390) {
> >> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
> >> -                   !test_bit(HCI_RUNNING, &hdev->flags))
> >
> > This probably deserves a comment on why you end up with
> > HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
> > are removing the flags above since that was introduce to prevent
> > use-after-free this sort of revert it so I do wonder how serdev can
> > still be open if you haven't tested for QCA_BT_OFF for example?
> >
> okay, let me give comments at next version.
> this design logic is shown below. you maybe review it.
>
> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup()
> is able to be invoked by every open() to initializate SoC without any
> help. so we don't need to send the VSC to reset SoC into initial and
> clean state for the next hdev->setup() call success.
>
> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set.
>
> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the
> SOC is already in the initial and clean state, so we also don't need to
> send the VSC to reset SOC.
>
> otherwise, we need to send the VSC to reset Soc into a initial and clean
> state for hdev->setup() call success after "warm reboot -> enable BT"
>
> for the case commit message cares about, the only factor which decide to
> send the VSC is that SoC is a initial and clean state or not after warm
> reboot, any other factors are irrelevant to this decision.
>
> why the serdev is still open after go through
> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)
> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that
> serdev is not closed by hci_uart_close().

Sounds like a logical jump to me, in fact hci_uart_close doesn't
really change any of those flags, beside these flags are not really
meant to tell the driver if serdev_device_close has been called or not
which seems to be the intention with HCI_UART_PROTO_READY so how about
we use that instead?

Another thing that is troubling me is that having traffic on shutdown
is not common, specially if you are going to reboot, etc, and even if
it doesn't get power cycle why don't you reset on probe rather than
shutdown? That way we don't have to depend on what has been done in a
previous boot, which can really become a problem in case of multi-OS
where you have another system that may not be doing what you expect.

> see hci_uart_close() within drivers/bluetooth/hci_serdev.c
> static int hci_uart_close(struct hci_dev *hdev)
> {
> ......
>         /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver,
>          * BT SOC is completely powered OFF during BT OFF, holding port
>          * open may drain the battery.
>          */
>         if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
>                 clear_bit(HCI_UART_PROTO_READY, &hu->flags);
>                 serdev_device_close(hu->serdev);
>         }
>
>         return 0;
> }
>
> >> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
> >> +                   hci_dev_test_flag(hdev, HCI_SETUP))
> >>                         return;
> >>
> >>                 serdev_device_write_flush(serdev);
> >> --
> >> 2.7.4
> >>
> >
> >
>
quic_zijuhu May 3, 2024, 9:51 p.m. UTC | #7
On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote:
> Hi,
> 
> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote:
>>
>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote:
>>> Hi Zijun,
>>>
>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote:
>>>>
>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
>>>> serdev") will cause below regression issue:
>>>>
>>>> BT can't be enabled after below steps:
>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
>>>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>>>>
>>>> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
>>>> during reboot, but also introduces this regression issue regarding above
>>>> steps since the VSC is not sent to reset controller during warm reboot.
>>>>
>>>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
>>>> once BT was ever enabled, and the use-after-free issue is also be fixed
>>>> by this change since serdev is still opened when send to serdev.
>>>>
>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
>>>> Cc: stable@vger.kernel.org
>>>> Reported-by: Wren Turkal <wt@penguintechs.org>
>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>>>> Tested-by: Wren Turkal <wt@penguintechs.org>
>>>> ---
>>>>  drivers/bluetooth/hci_qca.c | 5 ++---
>>>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
>>>> index 0c9c9ee56592..8e35c9091486 100644
>>>> --- a/drivers/bluetooth/hci_qca.c
>>>> +++ b/drivers/bluetooth/hci_qca.c
>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev)
>>>>         struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
>>>>         struct hci_uart *hu = &qcadev->serdev_hu;
>>>>         struct hci_dev *hdev = hu->hdev;
>>>> -       struct qca_data *qca = hu->priv;
>>>>         const u8 ibs_wake_cmd[] = { 0xFD };
>>>>         const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
>>>>
>>>>         if (qcadev->btsoc_type == QCA_QCA6390) {
>>>> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
>>>> -                   !test_bit(HCI_RUNNING, &hdev->flags))
>>>
>>> This probably deserves a comment on why you end up with
>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
>>> are removing the flags above since that was introduce to prevent
>>> use-after-free this sort of revert it so I do wonder how serdev can
>>> still be open if you haven't tested for QCA_BT_OFF for example?
>>>
>> okay, let me give comments at next version.
>> this design logic is shown below. you maybe review it.
>>
>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup()
>> is able to be invoked by every open() to initializate SoC without any
>> help. so we don't need to send the VSC to reset SoC into initial and
>> clean state for the next hdev->setup() call success.
>>
>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set.
>>
>> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the
>> SOC is already in the initial and clean state, so we also don't need to
>> send the VSC to reset SOC.
>>
>> otherwise, we need to send the VSC to reset Soc into a initial and clean
>> state for hdev->setup() call success after "warm reboot -> enable BT"
>>
>> for the case commit message cares about, the only factor which decide to
>> send the VSC is that SoC is a initial and clean state or not after warm
>> reboot, any other factors are irrelevant to this decision.
>>
>> why the serdev is still open after go through
>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)
>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that
>> serdev is not closed by hci_uart_close().
> 
> Sounds like a logical jump to me, in fact hci_uart_close doesn't
> really change any of those flags, beside these flags are not really
> meant to tell the driver if serdev_device_close has been called or not
> which seems to be the intention with HCI_UART_PROTO_READY so how about
> we use that instead?
> 
sorry for that i maybe not give good explanation, let me explain again.
hci_uart_close() is the only point which maybe close serdev before
qca_serdev_shutdown() is called, but for our case that
HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not
close serdev for our case, so serdev must be open state before sending
the VSC. so should not need other checking.

> Another thing that is troubling me is that having traffic on shutdown
> is not common, specially if you are going to reboot, etc, and even if
> it doesn't get power cycle why don't you reset on probe rather than
> shutdown? That way we don't have to depend on what has been done in a
> previous boot, which can really become a problem in case of multi-OS
> where you have another system that may not be doing what you expect.
as you know, BT UART are working at 3M baudrate for normal usage.
we can't distinguish if SoC expects 3M or default 11.52K baudarate
during probe() after reboot. so we send the VSC within shutdown to make
sure SoC enter a initial state with 11.52 baudrate.

for cold boot, SOC expects default 11.52K baudrate for probe().
for Enable BT -> warm boot, SOC expects 3M baudrate for probe().
we can't tell these two case within probe(). so need to send the VSC
within shutdown().

>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c
>> static int hci_uart_close(struct hci_dev *hdev)
>> {
>> ......
>>         /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver,
>>          * BT SOC is completely powered OFF during BT OFF, holding port
>>          * open may drain the battery.
>>          */
>>         if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
>>                 clear_bit(HCI_UART_PROTO_READY, &hu->flags);
>>                 serdev_device_close(hu->serdev);
>>         }
>>
>>         return 0;
>> }
>>
>>>> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
>>>> +                   hci_dev_test_flag(hdev, HCI_SETUP))
>>>>                         return;
>>>>
>>>>                 serdev_device_write_flush(serdev);
>>>> --
>>>> 2.7.4
>>>>
>>>
>>>
>>
> 
>
Lk Sii May 7, 2024, 1:48 p.m. UTC | #8
On 2024/5/4 05:51, quic_zijuhu wrote:
> On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote:
>> Hi,
>>
>> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote:
>>>
>>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote:
>>>> Hi Zijun,
>>>>
>>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote:
>>>>>
>>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
>>>>> serdev") will cause below regression issue:
>>>>>
>>>>> BT can't be enabled after below steps:
>>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
>>>>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>>>>>
>>>>> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
>>>>> during reboot, but also introduces this regression issue regarding above
>>>>> steps since the VSC is not sent to reset controller during warm reboot.
>>>>>
>>>>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
>>>>> once BT was ever enabled, and the use-after-free issue is also be fixed
>>>>> by this change since serdev is still opened when send to serdev.
>>>>>
>>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
>>>>> Cc: stable@vger.kernel.org
>>>>> Reported-by: Wren Turkal <wt@penguintechs.org>
>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>>>>> Tested-by: Wren Turkal <wt@penguintechs.org>
>>>>> ---
>>>>>  drivers/bluetooth/hci_qca.c | 5 ++---
>>>>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
>>>>> index 0c9c9ee56592..8e35c9091486 100644
>>>>> --- a/drivers/bluetooth/hci_qca.c
>>>>> +++ b/drivers/bluetooth/hci_qca.c
>>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev)
>>>>>         struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
>>>>>         struct hci_uart *hu = &qcadev->serdev_hu;
>>>>>         struct hci_dev *hdev = hu->hdev;
>>>>> -       struct qca_data *qca = hu->priv;
>>>>>         const u8 ibs_wake_cmd[] = { 0xFD };
>>>>>         const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
>>>>>
>>>>>         if (qcadev->btsoc_type == QCA_QCA6390) {
>>>>> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
>>>>> -                   !test_bit(HCI_RUNNING, &hdev->flags))
>>>>
>>>> This probably deserves a comment on why you end up with
>>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
>>>> are removing the flags above since that was introduce to prevent
>>>> use-after-free this sort of revert it so I do wonder how serdev can
>>>> still be open if you haven't tested for QCA_BT_OFF for example?
>>>>
>>> okay, let me give comments at next version.
>>> this design logic is shown below. you maybe review it.
>>>
>>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup()
>>> is able to be invoked by every open() to initializate SoC without any
>>> help. so we don't need to send the VSC to reset SoC into initial and
>>> clean state for the next hdev->setup() call success.
>>>
>>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set.
>>>
>>> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the
>>> SOC is already in the initial and clean state, so we also don't need to
>>> send the VSC to reset SOC.
>>>
>>> otherwise, we need to send the VSC to reset Soc into a initial and clean
>>> state for hdev->setup() call success after "warm reboot -> enable BT"
>>>
>>> for the case commit message cares about, the only factor which decide to
>>> send the VSC is that SoC is a initial and clean state or not after warm
>>> reboot, any other factors are irrelevant to this decision.
>>>
>>> why the serdev is still open after go through
>>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)
>>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that
>>> serdev is not closed by hci_uart_close().
>>
>> Sounds like a logical jump to me, in fact hci_uart_close doesn't
>> really change any of those flags, beside these flags are not really
>> meant to tell the driver if serdev_device_close has been called or not
>> which seems to be the intention with HCI_UART_PROTO_READY so how about
>> we use that instead?
>>
> sorry for that i maybe not give good explanation, let me explain again.
> hci_uart_close() is the only point which maybe close serdev before
> qca_serdev_shutdown() is called, but for our case that
> HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not
> close serdev for our case, so serdev must be open state before sending
> the VSC. so should not need other checking.
>
hello, i have paid attention to your discussion for a long time, i would
like to join this discussion now.

The serdev is still open before sending the VSC for this patch.

are you agree with above Zijun's point?

>> Another thing that is troubling me is that having traffic on shutdown
>> is not common, specially if you are going to reboot, etc, and even if
>> it doesn't get power cycle why don't you reset on probe rather than
>> shutdown? That way we don't have to depend on what has been done in a
>> previous boot, which can really become a problem in case of multi-OS
>> where you have another system that may not be doing what you expect.
> as you know, BT UART are working at 3M baudrate for normal usage.
> we can't distinguish if SoC expects 3M or default 11.52K baudarate
> during probe() after reboot. so we send the VSC within shutdown to make
> sure SoC enter a initial state with 11.52 baudrate.
> 
> for cold boot, SOC expects default 11.52K baudrate for probe().
> for Enable BT -> warm boot, SOC expects 3M baudrate for probe().
> we can't tell these two case within probe(). so need to send the VSC
> within shutdown().
>
it seems the traffic within qca_serdev_shutdown() actually does software
reset for BT SOC.

From Zijun's points. the reasons why to do software reset within
shutdown() instead of probe() maybe be shown below
1) it is impossible to do software reset within probe().
2) it seems it is easier to do it within shutdown() than probe.

Zijun's simple fix only change the condition and does NOT change the
location to send the VSC, i think it maybe be other topic about location
where(probe() or shutdown()) to do software reset.

are you agree with this point?


For concern about multi-OS, i would like to show my points.

the patch is for linux kernel, we maybe only need to care about linux
OS. it maybe out-of-scope to make assumptions about other OSs vendor
announced supported such as Windows OS we don't known much about.

are you agree with this point?

>>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c
>>> static int hci_uart_close(struct hci_dev *hdev)
>>> {
>>> ......
>>>         /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver,
>>>          * BT SOC is completely powered OFF during BT OFF, holding port
>>>          * open may drain the battery.
>>>          */
>>>         if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
>>>                 clear_bit(HCI_UART_PROTO_READY, &hu->flags);
>>>                 serdev_device_close(hu->serdev);
>>>         }
>>>
>>>         return 0;
>>> }
>>>
>>>>> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
>>>>> +                   hci_dev_test_flag(hdev, HCI_SETUP))
>>>>>                         return;
>>>>>
>>>>>                 serdev_device_write_flush(serdev);
>>>>> --
>>>>> 2.7.4
>>>>>
>>>>
>>>>
>>>
>>
>>
> 
> 
>
Wren Turkal May 10, 2024, 7:49 p.m. UTC | #9
On 5/3/24 3:16 AM, Krzysztof Kozlowski wrote:
> On 02/05/2024 16:06, Zijun Hu wrote:
>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
>> serdev") will cause below regression issue:
>>
>> BT can't be enabled after below steps:
>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>>
>> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
>> during reboot, but also introduces this regression issue regarding above
>> steps since the VSC is not sent to reset controller during warm reboot.
>>
>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
>> once BT was ever enabled, and the use-after-free issue is also be fixed
>> by this change since serdev is still opened when send to serdev.
>>
>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
>> Cc: stable@vger.kernel.org
>> Reported-by: Wren Turkal <wt@penguintechs.org>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>> Tested-by: Wren Turkal <wt@penguintechs.org>
>> ---
>>   drivers/bluetooth/hci_qca.c | 5 ++---
> 
> I don't think this is v1. Version your patches properly and provide
> changelog.
> 
> I asked already *two times*:
> 1. On which kernel did you test it?
> 2. On which hardware did you test it?

I thought I had already chimed in with this information. I am using a 
Dell XPS 13 9310. It's the only hardware I have access to. I can say 
that the fix seems to work as advertised in that it fixes the warm boot 
issue I have been experiencing.

As a user, I vastly prefer a one time cold boot rather than having to do 
it every time. As such, I can do the test. However, I do not think that 
the patch not working in that case should block landing logic that 
empirically improved my experience with the hardware and minimizes the 
corner case to a much smaller corner (that of warm booting from an OS 
that improperly puts the device in a bad state to a Linux kernel with 
this UX improvement).

wt
Wren Turkal May 10, 2024, 8:45 p.m. UTC | #10
On 5/7/24 6:48 AM, Lk Sii wrote:
> On 2024/5/4 05:51, quic_zijuhu wrote:
>> On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote:
>>> Hi,
>>>
>>> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com> wrote:
>>>>
>>>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote:
>>>>> Hi Zijun,
>>>>>
>>>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com> wrote:
>>>>>>
>>>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
>>>>>> serdev") will cause below regression issue:
>>>>>>
>>>>>> BT can't be enabled after below steps:
>>>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
>>>>>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
>>>>>>
>>>>>> The commit is to fix a use-after-free issue within qca_serdev_shutdown()
>>>>>> during reboot, but also introduces this regression issue regarding above
>>>>>> steps since the VSC is not sent to reset controller during warm reboot.
>>>>>>
>>>>>> Fixed by sending the VSC to reset controller within qca_serdev_shutdown()
>>>>>> once BT was ever enabled, and the use-after-free issue is also be fixed
>>>>>> by this change since serdev is still opened when send to serdev.
>>>>>>
>>>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
>>>>>> Cc: stable@vger.kernel.org
>>>>>> Reported-by: Wren Turkal <wt@penguintechs.org>
>>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>>>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>>>>>> Tested-by: Wren Turkal <wt@penguintechs.org>
>>>>>> ---
>>>>>>   drivers/bluetooth/hci_qca.c | 5 ++---
>>>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
>>>>>> index 0c9c9ee56592..8e35c9091486 100644
>>>>>> --- a/drivers/bluetooth/hci_qca.c
>>>>>> +++ b/drivers/bluetooth/hci_qca.c
>>>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct device *dev)
>>>>>>          struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
>>>>>>          struct hci_uart *hu = &qcadev->serdev_hu;
>>>>>>          struct hci_dev *hdev = hu->hdev;
>>>>>> -       struct qca_data *qca = hu->priv;
>>>>>>          const u8 ibs_wake_cmd[] = { 0xFD };
>>>>>>          const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
>>>>>>
>>>>>>          if (qcadev->btsoc_type == QCA_QCA6390) {
>>>>>> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
>>>>>> -                   !test_bit(HCI_RUNNING, &hdev->flags))
>>>>>
>>>>> This probably deserves a comment on why you end up with
>>>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
>>>>> are removing the flags above since that was introduce to prevent
>>>>> use-after-free this sort of revert it so I do wonder how serdev can
>>>>> still be open if you haven't tested for QCA_BT_OFF for example?
>>>>>
>>>> okay, let me give comments at next version.
>>>> this design logic is shown below. you maybe review it.
>>>>
>>>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup()
>>>> is able to be invoked by every open() to initializate SoC without any
>>>> help. so we don't need to send the VSC to reset SoC into initial and
>>>> clean state for the next hdev->setup() call success.
>>>>
>>>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set.
>>>>
>>>> if HCI_SETUP is set, it means hdev->setup() was never be invoked, so the
>>>> SOC is already in the initial and clean state, so we also don't need to
>>>> send the VSC to reset SOC.
>>>>
>>>> otherwise, we need to send the VSC to reset Soc into a initial and clean
>>>> state for hdev->setup() call success after "warm reboot -> enable BT"
>>>>
>>>> for the case commit message cares about, the only factor which decide to
>>>> send the VSC is that SoC is a initial and clean state or not after warm
>>>> reboot, any other factors are irrelevant to this decision.
>>>>
>>>> why the serdev is still open after go through
>>>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)
>>>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that
>>>> serdev is not closed by hci_uart_close().
>>>
>>> Sounds like a logical jump to me, in fact hci_uart_close doesn't
>>> really change any of those flags, beside these flags are not really
>>> meant to tell the driver if serdev_device_close has been called or not
>>> which seems to be the intention with HCI_UART_PROTO_READY so how about
>>> we use that instead?
>>>
>> sorry for that i maybe not give good explanation, let me explain again.
>> hci_uart_close() is the only point which maybe close serdev before
>> qca_serdev_shutdown() is called, but for our case that
>> HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not
>> close serdev for our case, so serdev must be open state before sending
>> the VSC. so should not need other checking.
>>
> hello, i have paid attention to your discussion for a long time, i would
> like to join this discussion now.
> 
> The serdev is still open before sending the VSC for this patch.
> 
> are you agree with above Zijun's point?

I will say that this part of the discussion seems to be addressing KK's 
concerns.

@KK, is this accurate?

@Zijun, this description along with the info about the baud rate issues 
should probably be part of the commit message. In these last two 
messages, you have been much clearer about why this logic is needed and 
correct. I wish you'd provided this description from the beginning when 
KK asked for more information about the logic change.

>>> Another thing that is troubling me is that having traffic on shutdown
>>> is not common, specially if you are going to reboot, etc, and even if
>>> it doesn't get power cycle why don't you reset on probe rather than
>>> shutdown? That way we don't have to depend on what has been done in a
>>> previous boot, which can really become a problem in case of multi-OS
>>> where you have another system that may not be doing what you expect.
>> as you know, BT UART are working at 3M baudrate for normal usage.
>> we can't distinguish if SoC expects 3M or default 11.52K baudarate
>> during probe() after reboot. so we send the VSC within shutdown to make
>> sure SoC enter a initial state with 11.52 baudrate.
>>
>> for cold boot, SOC expects default 11.52K baudrate for probe().
>> for Enable BT -> warm boot, SOC expects 3M baudrate for probe().
>> we can't tell these two case within probe(). so need to send the VSC
>> within shutdown().
>>
> it seems the traffic within qca_serdev_shutdown() actually does software
> reset for BT SOC.
> 
>  From Zijun's points. the reasons why to do software reset within
> shutdown() instead of probe() maybe be shown below
> 1) it is impossible to do software reset within probe().
> 2) it seems it is easier to do it within shutdown() than probe.
> 
> Zijun's simple fix only change the condition and does NOT change the
> location to send the VSC, i think it maybe be other topic about location
> where(probe() or shutdown()) to do software reset.
> 
> are you agree with this point?

 From a practical standpoint, this change does seem to fix the warm boot 
issue on my laptop. I do not think that it would fix the issue of 
booting from an OS that puts the hardware into an unknown state.

> For concern about multi-OS, i would like to show my points.
> 
> the patch is for linux kernel, we maybe only need to care about linux
> OS. it maybe out-of-scope to make assumptions about other OSs vendor
> announced supported such as Windows OS we don't known much about.
> 
> are you agree with this point?

I would not fully agree with this point. However, I would agree 
completely with your proposal for moving forward (i.e. landing the change).

I would say that it is a problem if the kernel is not doing something to 
setup the hardware correctly in any case where it is technically 
possible. That is clearly a bug in the driver and Qualcomm should take 
responsibility for fixing this poor design. Luiz is totally right here.

Having said that, I don't see that bug as a blocker for a logic fix that 
creates an obvious (in my mind) UX improvement.

Here's my view of the situation. Right now, I experience a bug on every 
single warm boot or module reload).

After Zijun's improvement commit, I might experience this problem iif 
the right set of rare circumstances occurs (i.e. whenever I warm boot 
from an OS that puts the hardware in an unknown state, like the current 
mainline kernel to a kernel with the improvement).

In the world where the design problem of the init/shutdown sequences are 
fixed AND this logic change is applied, I might never see this problem 
again.

These seem like two different problems in my head. They don't seem 
directly logically related, and the blast radii of the problems are very 
different.

Here's the facts as I see them:
1. Logic improvement problem and init/shutdown sequence problem are 2 
orthogonal problems
2. Logic improvement greatly users' chances of running into the bad UX 
of the current code.
3. Init/shutdown isn't a trivial extension from the logic improvement.
4. Logic improvement has a pretty low cost to apply.
5. Init/shutdown sequence fix seems to be more fundamental and more 
intrusive.

All of these together indicate to me that the logic improvement should 
be landed. The init/shutdown issue should also be fixed, but a fix for 
that issue should not block the logic improvement.

My basic reasoning for this is that a visible UX fix should not e 
blocked for a change like the init/shutdown logic fix when the logic fix 
will help users.

However, Qualcomm does need to feel some pressure to fix their driver 
code. I would like to think that user's will not be held hostage for 
putting pressure on a vendor in this case as the balance seems very 
intrusive for users. It certainly feels intrusive as a user to have to 
be very careful about how I reboot my laptop.

>>>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c
>>>> static int hci_uart_close(struct hci_dev *hdev)
>>>> {
>>>> ......
>>>>          /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver,
>>>>           * BT SOC is completely powered OFF during BT OFF, holding port
>>>>           * open may drain the battery.
>>>>           */
>>>>          if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
>>>>                  clear_bit(HCI_UART_PROTO_READY, &hu->flags);
>>>>                  serdev_device_close(hu->serdev);
>>>>          }
>>>>
>>>>          return 0;
>>>> }
>>>>
>>>>>> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
>>>>>> +                   hci_dev_test_flag(hdev, HCI_SETUP))
>>>>>>                          return;
>>>>>>
>>>>>>                  serdev_device_write_flush(serdev);
>>>>>> --
>>>>>> 2.7.4
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>>
>
Lk Sii May 11, 2024, 4:09 a.m. UTC | #11
On 2024/5/11 04:45, Wren Turkal wrote:
> On 5/7/24 6:48 AM, Lk Sii wrote:
>> On 2024/5/4 05:51, quic_zijuhu wrote:
>>> On 5/4/2024 5:25 AM, Luiz Augusto von Dentz wrote:
>>>> Hi,
>>>>
>>>> On Fri, May 3, 2024 at 4:18 PM quic_zijuhu <quic_zijuhu@quicinc.com>
>>>> wrote:
>>>>>
>>>>> On 5/4/2024 3:22 AM, Luiz Augusto von Dentz wrote:
>>>>>> Hi Zijun,
>>>>>>
>>>>>> On Thu, May 2, 2024 at 10:06 AM Zijun Hu <quic_zijuhu@quicinc.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on
>>>>>>> closed
>>>>>>> serdev") will cause below regression issue:
>>>>>>>
>>>>>>> BT can't be enabled after below steps:
>>>>>>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable
>>>>>>> failure
>>>>>>> if property enable-gpios is not configured within DT|ACPI for
>>>>>>> QCA6390.
>>>>>>>
>>>>>>> The commit is to fix a use-after-free issue within
>>>>>>> qca_serdev_shutdown()
>>>>>>> during reboot, but also introduces this regression issue
>>>>>>> regarding above
>>>>>>> steps since the VSC is not sent to reset controller during warm
>>>>>>> reboot.
>>>>>>>
>>>>>>> Fixed by sending the VSC to reset controller within
>>>>>>> qca_serdev_shutdown()
>>>>>>> once BT was ever enabled, and the use-after-free issue is also be
>>>>>>> fixed
>>>>>>> by this change since serdev is still opened when send to serdev.
>>>>>>>
>>>>>>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on
>>>>>>> closed serdev")
>>>>>>> Cc: stable@vger.kernel.org
>>>>>>> Reported-by: Wren Turkal <wt@penguintechs.org>
>>>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
>>>>>>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>>>>>>> Tested-by: Wren Turkal <wt@penguintechs.org>
>>>>>>> ---
>>>>>>>   drivers/bluetooth/hci_qca.c | 5 ++---
>>>>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/bluetooth/hci_qca.c
>>>>>>> b/drivers/bluetooth/hci_qca.c
>>>>>>> index 0c9c9ee56592..8e35c9091486 100644
>>>>>>> --- a/drivers/bluetooth/hci_qca.c
>>>>>>> +++ b/drivers/bluetooth/hci_qca.c
>>>>>>> @@ -2450,13 +2450,12 @@ static void qca_serdev_shutdown(struct
>>>>>>> device *dev)
>>>>>>>          struct qca_serdev *qcadev =
>>>>>>> serdev_device_get_drvdata(serdev);
>>>>>>>          struct hci_uart *hu = &qcadev->serdev_hu;
>>>>>>>          struct hci_dev *hdev = hu->hdev;
>>>>>>> -       struct qca_data *qca = hu->priv;
>>>>>>>          const u8 ibs_wake_cmd[] = { 0xFD };
>>>>>>>          const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01,
>>>>>>> 0x05 };
>>>>>>>
>>>>>>>          if (qcadev->btsoc_type == QCA_QCA6390) {
>>>>>>> -               if (test_bit(QCA_BT_OFF, &qca->flags) ||
>>>>>>> -                   !test_bit(HCI_RUNNING, &hdev->flags))
>>>>>>
>>>>>> This probably deserves a comment on why you end up with
>>>>>> HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP flags here, also why you
>>>>>> are removing the flags above since that was introduce to prevent
>>>>>> use-after-free this sort of revert it so I do wonder how serdev can
>>>>>> still be open if you haven't tested for QCA_BT_OFF for example?
>>>>>>
>>>>> okay, let me give comments at next version.
>>>>> this design logic is shown below. you maybe review it.
>>>>>
>>>>> if HCI_QUIRK_NON_PERSISTENT_SETUP is set, it means that hdev->setup()
>>>>> is able to be invoked by every open() to initializate SoC without any
>>>>> help. so we don't need to send the VSC to reset SoC into initial and
>>>>> clean state for the next hdev->setup() call success.
>>>>>
>>>>> otherwise, namely, HCI_QUIRK_NON_PERSISTENT_SETUP is not set.
>>>>>
>>>>> if HCI_SETUP is set, it means hdev->setup() was never be invoked,
>>>>> so the
>>>>> SOC is already in the initial and clean state, so we also don't
>>>>> need to
>>>>> send the VSC to reset SOC.
>>>>>
>>>>> otherwise, we need to send the VSC to reset Soc into a initial and
>>>>> clean
>>>>> state for hdev->setup() call success after "warm reboot -> enable BT"
>>>>>
>>>>> for the case commit message cares about, the only factor which
>>>>> decide to
>>>>> send the VSC is that SoC is a initial and clean state or not after
>>>>> warm
>>>>> reboot, any other factors are irrelevant to this decision.
>>>>>
>>>>> why the serdev is still open after go through
>>>>> (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)
>>>>> || hci_dev_test_flag(hdev, HCI_SETUP) checking is that
>>>>> serdev is not closed by hci_uart_close().
>>>>
>>>> Sounds like a logical jump to me, in fact hci_uart_close doesn't
>>>> really change any of those flags, beside these flags are not really
>>>> meant to tell the driver if serdev_device_close has been called or not
>>>> which seems to be the intention with HCI_UART_PROTO_READY so how about
>>>> we use that instead?
>>>>
>>> sorry for that i maybe not give good explanation, let me explain again.
>>> hci_uart_close() is the only point which maybe close serdev before
>>> qca_serdev_shutdown() is called, but for our case that
>>> HCI_QUIRK_NON_PERSISTENT_SETUP is NOT set, hci_uart_close() will not
>>> close serdev for our case, so serdev must be open state before sending
>>> the VSC. so should not need other checking.
>>>
>> hello, i have paid attention to your discussion for a long time, i would
>> like to join this discussion now.
>>
>> The serdev is still open before sending the VSC for this patch.
>>
>> are you agree with above Zijun's point?
> 
> I will say that this part of the discussion seems to be addressing KK's
> concerns.
> 
> @KK, is this accurate?
> 
> @Zijun, this description along with the info about the baud rate issues
> should probably be part of the commit message. In these last two
> messages, you have been much clearer about why this logic is needed and
> correct. I wish you'd provided this description from the beginning when
> KK asked for more information about the logic change.
> 
>>>> Another thing that is troubling me is that having traffic on shutdown
>>>> is not common, specially if you are going to reboot, etc, and even if
>>>> it doesn't get power cycle why don't you reset on probe rather than
>>>> shutdown? That way we don't have to depend on what has been done in a
>>>> previous boot, which can really become a problem in case of multi-OS
>>>> where you have another system that may not be doing what you expect.
>>> as you know, BT UART are working at 3M baudrate for normal usage.
>>> we can't distinguish if SoC expects 3M or default 11.52K baudarate
>>> during probe() after reboot. so we send the VSC within shutdown to make
>>> sure SoC enter a initial state with 11.52 baudrate.
>>>
>>> for cold boot, SOC expects default 11.52K baudrate for probe().
>>> for Enable BT -> warm boot, SOC expects 3M baudrate for probe().
>>> we can't tell these two case within probe(). so need to send the VSC
>>> within shutdown().
>>>
>> it seems the traffic within qca_serdev_shutdown() actually does software
>> reset for BT SOC.
>>
>>  From Zijun's points. the reasons why to do software reset within
>> shutdown() instead of probe() maybe be shown below
>> 1) it is impossible to do software reset within probe().
>> 2) it seems it is easier to do it within shutdown() than probe.
>>
>> Zijun's simple fix only change the condition and does NOT change the
>> location to send the VSC, i think it maybe be other topic about location
>> where(probe() or shutdown()) to do software reset.
>>
>> are you agree with this point?
> 
> From a practical standpoint, this change does seem to fix the warm boot
> issue on my laptop. I do not think that it would fix the issue of
> booting from an OS that puts the hardware into an unknown state.
>
we may not need to focus on below multi-OS issue Luiz assume

Other OS such as Windows -> warm boot -> boot linux OS -> enable BT failure.

firstly, we don't know and can't confirm that other OS don't the similar
jobs as linux kernel do, so can't confirm if the assumed issue is a real
and valid issue.

secondary, if other OS is supported as announced by vendor Qualcomm,
their BT driver for other OS maybe have contained solution for Multi-OS
relevant concerns.

>> For concern about multi-OS, i would like to show my points.
>>
>> the patch is for linux kernel, we maybe only need to care about linux
>> OS. it maybe out-of-scope to make assumptions about other OSs vendor
>> announced supported such as Windows OS we don't known much about.
>>
>> are you agree with this point?
> 
> I would not fully agree with this point. However, I would agree
> completely with your proposal for moving forward (i.e. landing the change).
> 
i agree with above Wren's point about moving forward.

> I would say that it is a problem if the kernel is not doing something to
> setup the hardware correctly in any case where it is technically
> possible. That is clearly a bug in the driver and Qualcomm should take
> responsibility for fixing this poor design. Luiz is totally right here.
>i don't agree with your above point that it is a bug and driver's poor
design, i don't actually understand how|who to conclude that present
design is clearly a bug and is poor design.

it seems this design(software reset within shutdown()) is dedicated
for the scenario that HOST doesn't have H/W resource to reset SOC if you
noticed below link
https://patchwork.kernel.org/project/bluetooth/patch/1713947712-4307-1-git-send-email-quic_zijuhu@quicinc.com/

For the scenario, perhaps, vendor Qualcomm have had more considerations
than you can image and finally selected this doable design.

as i commented previously. this patch doesn't touch software reset
location designed. so it maybe be out-of-scope to discuss why probe()
does not do the job.

> Having said that, I don't see that bug as a blocker for a logic fix that
> creates an obvious (in my mind) UX improvement.
> 
> Here's my view of the situation. Right now, I experience a bug on every
> single warm boot or module reload).
> 
> After Zijun's improvement commit, I might experience this problem iif
> the right set of rare circumstances occurs (i.e. whenever I warm boot
> from an OS that puts the hardware in an unknown state, like the current
> mainline kernel to a kernel with the improvement).
> > In the world where the design problem of the init/shutdown sequences are
> fixed AND this logic change is applied, I might never see this problem
> again.
> 
> These seem like two different problems in my head. They don't seem
> directly logically related, and the blast radii of the problems are very
> different.
> 
> Here's the facts as I see them:
> 1. Logic improvement problem and init/shutdown sequence problem are 2
> orthogonal problems
> 2. Logic improvement greatly users' chances of running into the bad UX
> of the current code.
> 3. Init/shutdown isn't a trivial extension from the logic improvement.
> 4. Logic improvement has a pretty low cost to apply.
> 5. Init/shutdown sequence fix seems to be more fundamental and more
> intrusive.
> 
> All of these together indicate to me that the logic improvement should
> be landed. The init/shutdown issue should also be fixed, but a fix for
> that issue should not block the logic improvement.
> 
> My basic reasoning for this is that a visible UX fix should not e
> blocked for a change like the init/shutdown logic fix when the logic fix
> will help users.
> 
> However, Qualcomm does need to feel some pressure to fix their driver
> code. I would like to think that user's will not be held hostage for
> putting pressure on a vendor in this case as the balance seems very
> intrusive for users. It certainly feels intrusive as a user to have to
> be very careful about how I reboot my laptop.
> 
>>>>> see hci_uart_close() within drivers/bluetooth/hci_serdev.c
>>>>> static int hci_uart_close(struct hci_dev *hdev)
>>>>> {
>>>>> ......
>>>>>          /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by
>>>>> driver,
>>>>>           * BT SOC is completely powered OFF during BT OFF, holding
>>>>> port
>>>>>           * open may drain the battery.
>>>>>           */
>>>>>          if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP,
>>>>> &hdev->quirks)) {
>>>>>                  clear_bit(HCI_UART_PROTO_READY, &hu->flags);
>>>>>                  serdev_device_close(hu->serdev);
>>>>>          }
>>>>>
>>>>>          return 0;
>>>>> }
>>>>>
>>>>>>> +               if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP,
>>>>>>> &hdev->quirks) ||
>>>>>>> +                   hci_dev_test_flag(hdev, HCI_SETUP))
>>>>>>>                          return;
>>>>>>>
>>>>>>>                  serdev_device_write_flush(serdev);
>>>>>>> -- 
>>>>>>> 2.7.4
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
diff mbox series

Patch

diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 0c9c9ee56592..8e35c9091486 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -2450,13 +2450,12 @@  static void qca_serdev_shutdown(struct device *dev)
 	struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
 	struct hci_uart *hu = &qcadev->serdev_hu;
 	struct hci_dev *hdev = hu->hdev;
-	struct qca_data *qca = hu->priv;
 	const u8 ibs_wake_cmd[] = { 0xFD };
 	const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
 
 	if (qcadev->btsoc_type == QCA_QCA6390) {
-		if (test_bit(QCA_BT_OFF, &qca->flags) ||
-		    !test_bit(HCI_RUNNING, &hdev->flags))
+		if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
+		    hci_dev_test_flag(hdev, HCI_SETUP))
 			return;
 
 		serdev_device_write_flush(serdev);