diff mbox series

[v3,1/3] HID: i2c-hid: Rely on HID descriptor fetch to probe

Message ID 20240426225739.2166-2-kl@kl.wtf (mailing list archive)
State New
Delegated to: Jiri Kosina
Headers show
Series HID: i2c-hid: Probe and wake device with HID descriptor fetch | expand

Commit Message

Kenny Levinsen April 26, 2024, 10:47 p.m. UTC
To avoid error messages when a device is not present, b3a81b6c4fc6 added
an initial bus probe using a dummy i2c_smbus_read_byte() call.

Without this probe, i2c_hid_fetch_hid_descriptor() will fail internally
on a bus error and log. Treat the bus error as a missing device and
remove the error log so we can do away with the probe.

Tested-by: Lukasz Majczak <lma@chromium.org>
Reviewed-by: Lukasz Majczak <lma@chromium.org>
Signed-off-by: Kenny Levinsen <kl@kl.wtf>
---
 drivers/hid/i2c-hid/i2c-hid-core.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

Comments

Dmitry Torokhov April 27, 2024, 3:21 a.m. UTC | #1
Hi Kenny, Lukasz,

On Sat, Apr 27, 2024 at 12:47:07AM +0200, Kenny Levinsen wrote:
> To avoid error messages when a device is not present, b3a81b6c4fc6 added
> an initial bus probe using a dummy i2c_smbus_read_byte() call.
> 
> Without this probe, i2c_hid_fetch_hid_descriptor() will fail internally
> on a bus error and log. Treat the bus error as a missing device and
> remove the error log so we can do away with the probe.
> 
> Tested-by: Lukasz Majczak <lma@chromium.org>
> Reviewed-by: Lukasz Majczak <lma@chromium.org>
> Signed-off-by: Kenny Levinsen <kl@kl.wtf>
> ---
>  drivers/hid/i2c-hid/i2c-hid-core.c | 21 ++++++---------------
>  1 file changed, 6 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
> index d965382196c6..6ffa43d245b4 100644
> --- a/drivers/hid/i2c-hid/i2c-hid-core.c
> +++ b/drivers/hid/i2c-hid/i2c-hid-core.c
> @@ -872,12 +872,11 @@ static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid)
>  					      ihid->wHIDDescRegister,
>  					      &ihid->hdesc,
>  					      sizeof(ihid->hdesc));
> -		if (error) {
> -			dev_err(&ihid->client->dev,
> -				"failed to fetch HID descriptor: %d\n",
> -				error);
> -			return -ENODEV;
> -		}
> +
> +		/* The i2c drivers are a bit inconsistent with their error
> +		 * codes, so treat everything as -ENXIO for now. */
> +		if (error)
> +			return -ENXIO;

I really think we should differentiate the cases "we do not know if
there is a device" vs "we do known that there is a device and we have
strong expectation of what that device is, and we do not expect
communication to fail".

Because of that I think we should have separate retry for the original
i2c_smbus_read_byte() (if you want you can make a wrapper around it,
something like i2c_hid_check_device_present()", and if there is a
concern that we will run into similar issue on resume (either from
suspend or from hibernate) then we can have similar retry in
i2c_hid_fetch_hid_descriptor().

But I do think that i2c_hid_fetch_hid_descriptor() should not try to
clobber the error but rather log the true one.

>  	}
>  
>  	/* Validate the length of HID descriptor, the 4 first bytes:
> @@ -992,17 +991,9 @@ static int __i2c_hid_core_probe(struct i2c_hid *ihid)
>  	struct hid_device *hid = ihid->hid;
>  	int ret;
>  
> -	/* Make sure there is something at this address */
> -	ret = i2c_smbus_read_byte(client);
> -	if (ret < 0) {
> -		i2c_hid_dbg(ihid, "nothing at this address: %d\n", ret);
> -		return -ENXIO;
> -	}
> -
>  	ret = i2c_hid_fetch_hid_descriptor(ihid);
>  	if (ret < 0) {
> -		dev_err(&client->dev,
> -			"Failed to fetch the HID Descriptor\n");
> +		i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret);
>  		return ret;
>  	}
>  
> -- 
> 2.44.0
> 
> 

Thanks.
Kenny Levinsen April 27, 2024, 1:20 p.m. UTC | #2
On 4/27/24 5:21 AM, Dmitry Torokhov wrote:
> I really think we should differentiate the cases "we do not know if
> there is a device" vs "we do known that there is a device and we have
> strong expectation of what that device is, and we do not expect
> communication to fail".

My reasoning was that there is no difference between looking for address 
acknowledge on a probe read vs. a real command. Unfortunately, I ran 
into some issues with error code consistency that Doug highlighted...

Considering that the smbus probe bails on *any* error, it's really only 
ACK'd address + NACK'd register that remains, and I thought it maybe 
wouldn't be too harmful to just always have a debug log as suggested. 
However, I would still like *more* good errors by being specific about 
the error condition, and I plan to send some patches to get the number 
of drivers sending ENXIO up so we can comfortably rely on it in a future 
i2c-hid patch.

If you don't think it's acceptable to leave this as a pure debug print 
for now, I'll send a patch with just a minor clean-up and Łukasz' delays 
- then I'll try this again later when the driver situation has improved. 
I've been rapid-firing revisions, so I'll await an ACK this time... :)

---

For some context, I started looking at the i2c-hid driver after a recent 
regression around assumed Windows behavior, and found that the actual 
behavior differed a lot from our assumptions. Windows gets the job done 
notably quicker, with fewer messages and with shorter albeit differently 
placed delays.

My plan is to send patches that clean up and aligns our traffic more, 
speeding things up and hopefully deprecating some quirks. To that end, I 
would also like to get rid of the dummy read once we're comfortable with it.
Dmitry Torokhov April 30, 2024, 9:41 p.m. UTC | #3
On Sat, Apr 27, 2024 at 03:20:07PM +0200, Kenny Levinsen wrote:
> On 4/27/24 5:21 AM, Dmitry Torokhov wrote:
> > I really think we should differentiate the cases "we do not know if
> > there is a device" vs "we do known that there is a device and we have
> > strong expectation of what that device is, and we do not expect
> > communication to fail".
> 
> My reasoning was that there is no difference between looking for address
> acknowledge on a probe read vs. a real command. Unfortunately, I ran into
> some issues with error code consistency that Doug highlighted...

I actually believe there is. On Chromebooks we may source components
from several vendors and use them in our devices. The components
are electrically compatible with each other, have exactly the same
connector, and therefore interchangeable. Because of that at probe time
we do not quite know if the device is there at given address, or not
(i.e. the touchpad could be from a different vendor and listening on
another address) and we need to make a quick determination whether we
should continue with probe or not.

Once we decided that the probe should continue we commit to it and all
subsequent errors from the device should be treated as unexpected errors
worthy of logging. On resume we do not expect hardware configuration to
change, so again when resuming we also treat errors as errors and log
them and fail resume.

> 
> Considering that the smbus probe bails on *any* error, it's really only
> ACK'd address + NACK'd register that remains, and I thought it maybe
> wouldn't be too harmful to just always have a debug log as suggested.
> However, I would still like *more* good errors by being specific about the
> error condition, and I plan to send some patches to get the number of
> drivers sending ENXIO up so we can comfortably rely on it in a future
> i2c-hid patch.
> 
> If you don't think it's acceptable to leave this as a pure debug print for
> now, I'll send a patch with just a minor clean-up and Łukasz' delays - then
> I'll try this again later when the driver situation has improved. I've been
> rapid-firing revisions, so I'll await an ACK this time... :)
> 
> ---
> 
> For some context, I started looking at the i2c-hid driver after a recent
> regression around assumed Windows behavior, and found that the actual
> behavior differed a lot from our assumptions. Windows gets the job done
> notably quicker, with fewer messages and with shorter albeit differently
> placed delays.

I am not sure we can fully unify what Windows does and what Linux does,
mainly because our firmwares are different (I think Windows devices do a
lot more device discovery in firmware, Chrome OS historically tried to
limit amount of code in its firmware). We also need to make sure it
works on non-ACPI systems/ARM.

Thanks.
Kenny Levinsen May 1, 2024, 5:24 a.m. UTC | #4
On 4/30/24 11:41 PM, Dmitry Torokhov wrote:
> I actually believe there is. On Chromebooks we may source components
> from several vendors and use them in our devices. The components
> are electrically compatible with each other, have exactly the same
> connector, and therefore interchangeable. Because of that at probe time
> we do not quite know if the device is there at given address, or not
> (i.e. the touchpad could be from a different vendor and listening on
> another address) and we need to make a quick determination whether we
> should continue with probe or not.

Maybe I should clarify what I meant: All I2C operations start with the 
master writing the slave address to the bus. When a slave reads its own 
address off the bus, it pulls the data line low to ACK. If no device is 
present on the bus with the specified address, the line stays high which 
is a NACK. This means that on the bus level, we have a clear error 
condition specifically for no device with the specified address being 
present on the bus.

Whether the operation used is a dummy read or our first actual write 
should not matter - if the address is not acknowledged, the device is 
not present (or able to talk I2C). The problem lies in whether this "no 
device present on bus" error is reported clearly to us: Some drivers use 
-ENXIO specifically for this, some use it also for NACKs on written 
data, some report it but use other return codes for it, etc.

Even if we stick to the smbus probe in the long run, if we get to the 
point where we can rely on the error codes from I2C drivers we would be 
able to correctly log and propagate other error classes like bus errors 
or I2C driver issues which are all currently silenced as "nothing at 
address" by the smbus probe.

> I am not sure we can fully unify what Windows does and what Linux does,
> mainly because our firmwares are different (I think Windows devices do a
> lot more device discovery in firmware, Chrome OS historically tried to
> limit amount of code in its firmware). We also need to make sure it
> works on non-ACPI systems/ARM.

Good point. My main focus is also quirky behaviors we have added to 
replicate Windows behavior, the smbus probe just stood out in my bus traces.

I already sent 
https://lore.kernel.org/all/20240429233924.6453-1-kl@kl.wtf/ which goes 
back to improving the bus probe.
Dmitry Torokhov May 1, 2024, 7:09 p.m. UTC | #5
On Wed, May 01, 2024 at 07:24:08AM +0200, Kenny Levinsen wrote:
> On 4/30/24 11:41 PM, Dmitry Torokhov wrote:
> > I actually believe there is. On Chromebooks we may source components
> > from several vendors and use them in our devices. The components
> > are electrically compatible with each other, have exactly the same
> > connector, and therefore interchangeable. Because of that at probe time
> > we do not quite know if the device is there at given address, or not
> > (i.e. the touchpad could be from a different vendor and listening on
> > another address) and we need to make a quick determination whether we
> > should continue with probe or not.
> 
> Maybe I should clarify what I meant: All I2C operations start with the
> master writing the slave address to the bus. When a slave reads its own
> address off the bus, it pulls the data line low to ACK. If no device is
> present on the bus with the specified address, the line stays high which is
> a NACK. This means that on the bus level, we have a clear error condition
> specifically for no device with the specified address being present on the
> bus.
> 
> Whether the operation used is a dummy read or our first actual write should
> not matter - if the address is not acknowledged, the device is not present
> (or able to talk I2C).

Is it possible for a device to be wedged so hard that it refuses to
acknowledge the address?

> The problem lies in whether this "no device present
> on bus" error is reported clearly to us: Some drivers use -ENXIO
> specifically for this, some use it also for NACKs on written data, some
> report it but use other return codes for it, etc.
> 
> Even if we stick to the smbus probe in the long run, if we get to the point
> where we can rely on the error codes from I2C drivers we would be able to
> correctly log and propagate other error classes like bus errors or I2C
> driver issues which are all currently silenced as "nothing at address" by
> the smbus probe.

I think this depends on the answer to the question above. If there is
potential that the chip may stop responding, I still see benefit in
differentiating initial "soft touch" poke vs. hard errors once we
established that there is/was a device and it started misbehaving.

Thanks.
Kenny Levinsen May 1, 2024, 11:09 p.m. UTC | #6
On 5/1/24 9:09 PM, Dmitry Torokhov wrote:
> Is it possible for a device to be wedged so hard that it refuses to
> acknowledge the address?

A slave is allowed to not acknowledge if not able (e.g., "because it's 
performing some real time function"), but a slave that does not 
acknowledge its address is electrically indistinguishable from a 
disconnected device. In such case the device is impossible to detect 
through I2C operations, and neither smbus probe nor a "real" command 
will see it.

Any logic we have to silence missing devices will also silence entirely 
unresponsive or extremely non-cooperate devices. That is the price to 
pay for avoiding the log message unfortunately.

No other errors from the smbus probe or a real command would be related 
to device presence, and some of them even suggest a device is present 
but broken (arbitration loss, assuming no shorts).
diff mbox series

Patch

diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index d965382196c6..6ffa43d245b4 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -872,12 +872,11 @@  static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid)
 					      ihid->wHIDDescRegister,
 					      &ihid->hdesc,
 					      sizeof(ihid->hdesc));
-		if (error) {
-			dev_err(&ihid->client->dev,
-				"failed to fetch HID descriptor: %d\n",
-				error);
-			return -ENODEV;
-		}
+
+		/* The i2c drivers are a bit inconsistent with their error
+		 * codes, so treat everything as -ENXIO for now. */
+		if (error)
+			return -ENXIO;
 	}
 
 	/* Validate the length of HID descriptor, the 4 first bytes:
@@ -992,17 +991,9 @@  static int __i2c_hid_core_probe(struct i2c_hid *ihid)
 	struct hid_device *hid = ihid->hid;
 	int ret;
 
-	/* Make sure there is something at this address */
-	ret = i2c_smbus_read_byte(client);
-	if (ret < 0) {
-		i2c_hid_dbg(ihid, "nothing at this address: %d\n", ret);
-		return -ENXIO;
-	}
-
 	ret = i2c_hid_fetch_hid_descriptor(ihid);
 	if (ret < 0) {
-		dev_err(&client->dev,
-			"Failed to fetch the HID Descriptor\n");
+		i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret);
 		return ret;
 	}