diff mbox series

[v3,01/25] PCI: Add PCI_ERROR_RESPONSE and it's related definitions

Message ID f7960a4dee0e417eedd7d2e031d04ac9016c6686.1634825082.git.naveennaidu479@gmail.com (mailing list archive)
State Not Applicable
Headers show
Series Unify PCI error response checking | expand

Commit Message

Naveen Naidu Oct. 21, 2021, 3:07 p.m. UTC
An MMIO read from a PCI device that doesn't exist or doesn't respond
causes a PCI error.  There's no real data to return to satisfy the
CPU read, so most hardware fabricates ~0 data.

Add a PCI_ERROR_RESPONSE definition for that and use it where
appropriate to make these checks consistent and easier to find.

Also add helper definitions SET_PCI_ERROR_RESPONSE and
RESPONSE_IS_PCI_ERROR to make the code more readable.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
---
 include/linux/pci.h | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Pali Rohár Oct. 21, 2021, 4:08 p.m. UTC | #1
On Thursday 21 October 2021 20:37:26 Naveen Naidu wrote:
> An MMIO read from a PCI device that doesn't exist or doesn't respond
> causes a PCI error.  There's no real data to return to satisfy the
> CPU read, so most hardware fabricates ~0 data.
> 
> Add a PCI_ERROR_RESPONSE definition for that and use it where
> appropriate to make these checks consistent and easier to find.
> 
> Also add helper definitions SET_PCI_ERROR_RESPONSE and
> RESPONSE_IS_PCI_ERROR to make the code more readable.
> 
> Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>

Reviewed-by: Pali Rohár <pali@kernel.org>

> ---
>  include/linux/pci.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..689c8277c584 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -154,6 +154,15 @@ enum pci_interrupt_pin {
>  /* The number of legacy PCI INTx interrupts */
>  #define PCI_NUM_INTX	4
>  
> +/*
> + * Reading from a device that doesn't respond typically returns ~0.  A
> + * successful read from a device may also return ~0, so you need additional
> + * information to reliably identify errors.
> + */
> +#define PCI_ERROR_RESPONSE     (~0ULL)
> +#define SET_PCI_ERROR_RESPONSE(val)    (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
> +#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))
> +
>  /*
>   * pci_power_t values must match the bits in the Capabilities PME_Support
>   * and Control/Status PowerState fields in the Power Management capability.
> -- 
> 2.25.1
>
Bjorn Helgaas Nov. 17, 2021, 11:58 p.m. UTC | #2
On Thu, Oct 21, 2021 at 08:37:26PM +0530, Naveen Naidu wrote:
> An MMIO read from a PCI device that doesn't exist or doesn't respond
> causes a PCI error.  There's no real data to return to satisfy the
> CPU read, so most hardware fabricates ~0 data.
> 
> Add a PCI_ERROR_RESPONSE definition for that and use it where
> appropriate to make these checks consistent and easier to find.
> 
> Also add helper definitions SET_PCI_ERROR_RESPONSE and
> RESPONSE_IS_PCI_ERROR to make the code more readable.
> 
> Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
> ---
>  include/linux/pci.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..689c8277c584 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -154,6 +154,15 @@ enum pci_interrupt_pin {
>  /* The number of legacy PCI INTx interrupts */
>  #define PCI_NUM_INTX	4
>  
> +/*
> + * Reading from a device that doesn't respond typically returns ~0.  A
> + * successful read from a device may also return ~0, so you need additional
> + * information to reliably identify errors.
> + */
> +#define PCI_ERROR_RESPONSE     (~0ULL)
> +#define SET_PCI_ERROR_RESPONSE(val)    (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
> +#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))

Beautiful!  I really like this.

I would prefer the macros to start with "PCI_", e.g.,
PCI_SET_ERROR_RESPONSE().

I think "RESPONSE_IS_PCI_ERROR()" is too strong because (as the
comment says), ~0 *may* indicate an error.  Or it may be a successful
read of a register that happens to contain ~0.

Possibilities to convey the idea that this isn't definitive:

  PCI_POSSIBLE_ERROR_RESPONSE(val)  # a little long
  PCI_LIKELY_ERROR(val)             # we really have no idea whether
  PCI_PROBABLE_ERROR(val)           #   likely or probable
  PCI_POSSIBLE_ERROR(val)           # promising?

Can you rebase to my "main" branch (v5.16-rc1), tweak the above, and
collect up the acks/reviews?

We should also browse drivers outside drivers/pci for places we could
use these.  Not necessarily as part of this series, although if
authors there object, it would be good to learn that earlier than
later.

Drivers that implement pci_error_handlers might be a fruitful place to
start.  But you've done a great job finding users of ~0 and 0xffff...
in drivers/pci/, too.

> +
>  /*
>   * pci_power_t values must match the bits in the Capabilities PME_Support
>   * and Control/Status PowerState fields in the Power Management capability.
> -- 
> 2.25.1
> 
> _______________________________________________
> Linux-kernel-mentees mailing list
> Linux-kernel-mentees@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
Naveen Naidu Nov. 18, 2021, 1:30 p.m. UTC | #3
On 17/11, Bjorn Helgaas wrote:
> On Thu, Oct 21, 2021 at 08:37:26PM +0530, Naveen Naidu wrote:
> > An MMIO read from a PCI device that doesn't exist or doesn't respond
> > causes a PCI error.  There's no real data to return to satisfy the
> > CPU read, so most hardware fabricates ~0 data.
> > 
> > Add a PCI_ERROR_RESPONSE definition for that and use it where
> > appropriate to make these checks consistent and easier to find.
> > 
> > Also add helper definitions SET_PCI_ERROR_RESPONSE and
> > RESPONSE_IS_PCI_ERROR to make the code more readable.
> > 
> > Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> > Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
> > ---
> >  include/linux/pci.h | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index cd8aa6fce204..689c8277c584 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -154,6 +154,15 @@ enum pci_interrupt_pin {
> >  /* The number of legacy PCI INTx interrupts */
> >  #define PCI_NUM_INTX	4
> >  
> > +/*
> > + * Reading from a device that doesn't respond typically returns ~0.  A
> > + * successful read from a device may also return ~0, so you need additional
> > + * information to reliably identify errors.
> > + */
> > +#define PCI_ERROR_RESPONSE     (~0ULL)
> > +#define SET_PCI_ERROR_RESPONSE(val)    (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
> > +#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))
> 
> Beautiful!  I really like this.
>

Thank you very much for the review ^^

> I would prefer the macros to start with "PCI_", e.g.,
> PCI_SET_ERROR_RESPONSE().
> 

ACK

> I think "RESPONSE_IS_PCI_ERROR()" is too strong because (as the
> comment says), ~0 *may* indicate an error.  Or it may be a successful
> read of a register that happens to contain ~0.
> 
> Possibilities to convey the idea that this isn't definitive:
> 
>   PCI_POSSIBLE_ERROR_RESPONSE(val)  # a little long
>   PCI_LIKELY_ERROR(val)             # we really have no idea whether
>   PCI_PROBABLE_ERROR(val)           #   likely or probable
>   PCI_POSSIBLE_ERROR(val)           # promising?
>

ACK. Will use PCI_POSSIBLE_ERROR()

> Can you rebase to my "main" branch (v5.16-rc1), tweak the above, and
> collect up the acks/reviews?
> 

ACK

> We should also browse drivers outside drivers/pci for places we could
> use these.  Not necessarily as part of this series, although if
> authors there object, it would be good to learn that earlier than
> later.
> 
> Drivers that implement pci_error_handlers might be a fruitful place to
> start.  But you've done a great job finding users of ~0 and 0xffff...
> in drivers/pci/, too.
> 

A quick grep showed that there are around 80 drivers which have
pci_error_handlers. I was thinking that it would be better if we handle
these drivers in another patch series since the current patch series is
itself 25 patches long. And in my short tenure reading LKML, I gathered
that folks generally are not so kind to a long list of patches in a
single patch series ^^' (I might be wrong though, Apologies)

The consensus on the patch series does seem slightly positive so
ideally, I was hoping that we would not have the case where a author
does not like the way we are handling this patch. Then again, I'm
pretty sure that I might be wrong ^^'

I hope it would be okay that I send in a new patch series with the
suggested changes and handle the other changes in another patch series
^^

Thanks,
Naveen
> > +
> >  /*
> >   * pci_power_t values must match the bits in the Capabilities PME_Support
> >   * and Control/Status PowerState fields in the Power Management capability.
> > -- 
> > 2.25.1
> > 
> > _______________________________________________
> > Linux-kernel-mentees mailing list
> > Linux-kernel-mentees@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
diff mbox series

Patch

diff --git a/include/linux/pci.h b/include/linux/pci.h
index cd8aa6fce204..689c8277c584 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -154,6 +154,15 @@  enum pci_interrupt_pin {
 /* The number of legacy PCI INTx interrupts */
 #define PCI_NUM_INTX	4
 
+/*
+ * Reading from a device that doesn't respond typically returns ~0.  A
+ * successful read from a device may also return ~0, so you need additional
+ * information to reliably identify errors.
+ */
+#define PCI_ERROR_RESPONSE     (~0ULL)
+#define SET_PCI_ERROR_RESPONSE(val)    (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
+#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))
+
 /*
  * pci_power_t values must match the bits in the Capabilities PME_Support
  * and Control/Status PowerState fields in the Power Management capability.