Message ID | 20210607131442.20121-1-heikki.krogerus@linux.intel.com (mailing list archive) |
---|---|
Headers | show |
Series | usb: typec: ucsi: Polling the alt modes and PDOs | expand |
Hi Heikki, On Mon, 2021-06-07 at 16:14 +0300, Heikki Krogerus wrote: > This is the RFC series I promised [1]. Cool. > I'm sorry it took this long to prepare these. I had to concentrate on > other task for a while. > > Let me know if you still see any problems. Hmm, I am not sure this got better. I applied the patchset on top of of the 5.12.9 Fedora 34 kernel. On the machine in question (X1 Carbon 8), I see the online state getting stuck at 1 occasionally. This can happen for example when quickly plugging and unplugging a USB-C charger. Benjamin > [1] > https://lore.kernel.org/linux-usb/YKT3oEt%2F9fX8k8mw@kuha.fi.intel.com/ > > thanks, > > Heikki Krogerus (7): > usb: typec: ucsi: Always cancel the command if PPM reports BUSY > condition > usb: typec: ucsi: Don't stop alt mode registration on busy > condition > usb: typec: ucsi: Add poll worker for alternate modes > usb: typec: ucsi: acpi: Reduce the command completion timeout > usb: typec: ucsi: Process every connector change as unique > connector > state > usb: typec: ucsi: Filter out spurious events > usb: typec: ucsi: Read the PDOs in separate work > > drivers/usb/typec/ucsi/ucsi.c | 317 ++++++++++++++++----------- > -- > drivers/usb/typec/ucsi/ucsi.h | 3 +- > drivers/usb/typec/ucsi/ucsi_acpi.c | 2 +- > 3 files changed, 183 insertions(+), 139 deletions(-) >
On Mon, Jun 07, 2021 at 10:09:58PM +0200, Benjamin Berg wrote: > Hi Heikki, > > On Mon, 2021-06-07 at 16:14 +0300, Heikki Krogerus wrote: > > This is the RFC series I promised [1]. > > Cool. > > > I'm sorry it took this long to prepare these. I had to concentrate on > > other task for a while. > > > > Let me know if you still see any problems. > > Hmm, I am not sure this got better. I applied the patchset on top of of > the 5.12.9 Fedora 34 kernel. On the machine in question (X1 Carbon 8), > I see the online state getting stuck at 1 occasionally. This can happen > for example when quickly plugging and unplugging a USB-C charger. Please check does the partner device get removed. What do you have under /sys/class/typec after that happens? thanks,
On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > On Mon, Jun 07, 2021 at 10:09:58PM +0200, Benjamin Berg wrote: > > Hi Heikki, > > > > On Mon, 2021-06-07 at 16:14 +0300, Heikki Krogerus wrote: > > > This is the RFC series I promised [1]. > > > > Cool. > > > > > I'm sorry it took this long to prepare these. I had to concentrate on > > > other task for a while. > > > > > > Let me know if you still see any problems. > > > > Hmm, I am not sure this got better. I applied the patchset on top of of > > the 5.12.9 Fedora 34 kernel. On the machine in question (X1 Carbon 8), > > I see the online state getting stuck at 1 occasionally. This can happen > > for example when quickly plugging and unplugging a USB-C charger. > > Please check does the partner device get removed. What do you have > under /sys/class/typec after that happens? Oh yes. Could you also share the trace output when that happens? cd /sys/kernel/debug/tracing echo 1 > events/ucsi/enable # now reproduce the issue cat trace > ucsi.trace thanks,
On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > Please check does the partner device get removed. What do you have > > under /sys/class/typec after that happens? > > Oh yes. Could you also share the trace output when that happens? > > cd /sys/kernel/debug/tracing > echo 1 > events/ucsi/enable > # now reproduce the issue > cat trace > ucsi.trace So, the partner device is still there when this happens (see below). I also only see a single event in the trace for the fast plug/unplug case: kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 The typec port/partner states is: port1-partner/accessory_mode:none port1-partner/supports_usb_power_delivery:no port1-partner/uevent:DEVTYPE=typec_partner port1-partner/usb_power_delivery_revision:0.0 and port0/data_role:host [device] port0/power_operation_mode:default port0/power_role:source [sink] port0/supported_accessory_modes:none port0/uevent:DEVTYPE=typec_port port0/uevent:TYPEC_PORT=port0 port0/usb_power_delivery_revision:2.0 port0/usb_typec_revision:1.0 port0/vconn_source:no port1/data_role:host [device] port1/power_operation_mode:3.0A port1/power_role:source [sink] port1/supported_accessory_modes:none port1/uevent:DEVTYPE=typec_port port1/uevent:TYPEC_PORT=port1 port1/usb_power_delivery_revision:2.0 port1/usb_typec_revision:1.0 port1/vconn_source:no Note that for a normal plug I am usually getting a second event. This second event is occasionally missing though: kworker/u16:38-1800 [001] .... 19522.325885: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:38-1800 [004] .... 19522.552613: ucsi_connector_change: port1 status: change=0044, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 and a working unplug consistently looks like: kworker/u16:8-1771 [003] .... 19670.020085: ucsi_connector_change: port1 status: change=4804, opmode=0, connected=0, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=0 Benjamin
On Tue, Jun 08, 2021 at 09:32:01PM +0200, Benjamin Berg wrote: > On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > > Please check does the partner device get removed. What do you have > > > under /sys/class/typec after that happens? > > > > Oh yes. Could you also share the trace output when that happens? > > > > cd /sys/kernel/debug/tracing > > echo 1 > events/ucsi/enable > > # now reproduce the issue > > cat trace > ucsi.trace > > So, the partner device is still there when this happens (see below). I > also only see a single event in the trace for the fast plug/unplug > case: > kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 OK. Sorry I had to double check because you were only talking about the psy online state. Can you now try this HACK on top of these patches: diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index bd39fe2cb1d0b..99f072700ce7f 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -843,7 +843,8 @@ static void ucsi_handle_connector_change(struct work_struct *work) if (!status.change) { dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); - goto out_ack; + /* XXX Force connection check. */ + status.change = UCSI_CONSTAT_CONNECT_CHANGE; } event = kzalloc(sizeof(*event), GFP_KERNEL); thanks,
On Wed, Jun 09, 2021 at 02:26:00PM +0300, Heikki Krogerus wrote: > On Tue, Jun 08, 2021 at 09:32:01PM +0200, Benjamin Berg wrote: > > On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > > > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > > > Please check does the partner device get removed. What do you have > > > > under /sys/class/typec after that happens? > > > > > > Oh yes. Could you also share the trace output when that happens? > > > > > > cd /sys/kernel/debug/tracing > > > echo 1 > events/ucsi/enable > > > # now reproduce the issue > > > cat trace > ucsi.trace > > > > So, the partner device is still there when this happens (see below). I > > also only see a single event in the trace for the fast plug/unplug > > case: > > kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 > > OK. Sorry I had to double check because you were only talking about > the psy online state. > > Can you now try this HACK on top of these patches: > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > index bd39fe2cb1d0b..99f072700ce7f 100644 > --- a/drivers/usb/typec/ucsi/ucsi.c > +++ b/drivers/usb/typec/ucsi/ucsi.c > @@ -843,7 +843,8 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > if (!status.change) { > dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); > - goto out_ack; > + /* XXX Force connection check. */ > + status.change = UCSI_CONSTAT_CONNECT_CHANGE; > } > > event = kzalloc(sizeof(*event), GFP_KERNEL); No, that's not enough. Sorry. I'm trying to get a confirmation on my suspecion that we do always actually get an event from the EC firmware, but we just end up filtering it out in this case because we are too slow in the driver. I have an idea what could be done about that, but I need to test if that really is the case. I'll prepare a new version out of this entire series. thanks,
On Wed, Jun 09, 2021 at 03:18:04PM +0300, Heikki Krogerus wrote: > On Wed, Jun 09, 2021 at 02:26:00PM +0300, Heikki Krogerus wrote: > > On Tue, Jun 08, 2021 at 09:32:01PM +0200, Benjamin Berg wrote: > > > On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > > > > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > > > > Please check does the partner device get removed. What do you have > > > > > under /sys/class/typec after that happens? > > > > > > > > Oh yes. Could you also share the trace output when that happens? > > > > > > > > cd /sys/kernel/debug/tracing > > > > echo 1 > events/ucsi/enable > > > > # now reproduce the issue > > > > cat trace > ucsi.trace > > > > > > So, the partner device is still there when this happens (see below). I > > > also only see a single event in the trace for the fast plug/unplug > > > case: > > > kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 > > > > OK. Sorry I had to double check because you were only talking about > > the psy online state. > > > > Can you now try this HACK on top of these patches: > > > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > > index bd39fe2cb1d0b..99f072700ce7f 100644 > > --- a/drivers/usb/typec/ucsi/ucsi.c > > +++ b/drivers/usb/typec/ucsi/ucsi.c > > @@ -843,7 +843,8 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > > > if (!status.change) { > > dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); > > - goto out_ack; > > + /* XXX Force connection check. */ > > + status.change = UCSI_CONSTAT_CONNECT_CHANGE; > > } > > > > event = kzalloc(sizeof(*event), GFP_KERNEL); > > No, that's not enough. Sorry. > > I'm trying to get a confirmation on my suspecion that we do always > actually get an event from the EC firmware, but we just end up > filtering it out in this case because we are too slow in the driver. I > have an idea what could be done about that, but I need to test if that > really is the case. > > I'll prepare a new version out of this entire series. Actually, it's easier if you could just test this attached patch on top of this series. It makes sure the every single event is considered. I'm sorry about the hassle. thanks,
On Wed, 2021-06-09 at 15:56 +0300, Heikki Krogerus wrote: > On Wed, Jun 09, 2021 at 03:18:04PM +0300, Heikki Krogerus wrote: > > > > I'm trying to get a confirmation on my suspecion that we do always > > actually get an event from the EC firmware, but we just end up > > filtering it out in this case because we are too slow in the driver. > > I > > have an idea what could be done about that, but I need to test if > > that > > really is the case. > > > > I'll prepare a new version out of this entire series. > > Actually, it's easier if you could just test this attached patch on > top of this series. It makes sure the every single event is > considered. I'm sorry about the hassle. No worries! I probably should have included some more information earlier (i.e. enabling the debug print for spurious events). With the patch, I am seeing the following on plug kworker/u16:1-6847 [002] .... 1375.485010: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:2-6848 [006] .... 1375.561811: ucsi_connector_change: port1 status: change=4000, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:2-6848 [007] .... 1375.634275: ucsi_connector_change: port1 status: change=4000, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:2-6848 [003] .... 1375.743161: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 and unplug kworker/u16:1-6847 [005] .... 1394.062501: ucsi_connector_change: port1 status: change=4804, opmode=0, connected=0, sourcing=0, partner_flags=0, partner_type=0, request_data_obj=00000000, BC status=0 kworker/u16:1-6847 [005] .... 1394.161612: ucsi_connector_change: port1 status: change=4000, opmode=0, connected=0, sourcing=0, partner_flags=0, partner_type=0, request_data_obj=00000000, BC status=0 kworker/u16:1-6847 [005] .... 1394.251503: ucsi_connector_change: port1 status: change=4000, opmode=0, connected=0, sourcing=0, partner_flags=0, partner_type=0, request_data_obj=00000000, BC status=0 where all but the first event are spurious events. I believe that in the above spurious event with the change to opmode=3, the PPM should be reporting change=0004 (i.e. UCSI_CONSTAT_POWER_OPMODE_CHANGE). Occasionally I also see the following on plug. Note the non-spurious event with change=0040 (UCSI_CONSTAT_POWER_LEVEL_CHANGE) right before the event where opmode changes. kworker/u16:11-2201 [001] .... 3240.124431: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.222799: ucsi_connector_change: port1 status: change=4000, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.325946: ucsi_connector_change: port1 status: change=0040, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.423503: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.861986: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [007] .... 3240.999048: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 My thought when I first ran into the issue was that the PPM simply resets the change bitfield on ACK, effectively discarding any changes that happened after the last GET_CONNECTOR_STATUS call. I believed to have confirmed this by inserting an msleep in between. However, I have to admit that I have never really looked for alternative explanations. Benjamin
On Wed, Jun 09, 2021 at 07:39:41PM +0200, Benjamin Berg wrote: > My thought when I first ran into the issue was that the PPM simply > resets the change bitfield on ACK, effectively discarding any changes > that happened after the last GET_CONNECTOR_STATUS call. I believed to > have confirmed this by inserting an msleep in between. > However, I have to admit that I have never really looked for > alternative explanations. Thanks a lot for testing these. I now have pretty good idea about the problem. The problem is not what you though it is, but the driver is also not too slow like I suspected. I'm quite certain now that the PPM simply does not create the event at all in this case, regardless of what the driver does. We do need to workaround that problem, but I think we can do it in a better way. I have an idea for that. thanks,