[v2,2/7] powerpc/kernel: Add uevents in EEH error/resume

Message ID	20171218223808.83928-3-bryantly@linux.vnet.ibm.com (mailing list archive)
State	New, archived
Delegated to:	Bjorn Helgaas
Headers	show Return-Path: <linux-pci-owner@kernel.org> Gateway: Authorized Use Only! Violators will be prosecuted for <linux-pci@vger.kernel.org> from <bryantly@linux.vnet.ibm.com>; Mon, 18 Dec 2017 17:38:31 -0500 Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 18 Dec 2017 17:38:27 -0500 From: "Bryant G. Ly" <bryantly@linux.vnet.ibm.com> To: benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au Cc: seroyer@linux.vnet.ibm.com, jjalvare@linux.vnet.ibm.com, alex.williamson@redhat.com, helgaas@kernel.org, aik@ozlabs.ru, ruscur@russell.cc, linux-pci@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, bodong@mellanox.com, eli@mellanox.com, saeedm@mellanox.com, "Bryant G. Ly" <bryantly@linux.vnet.ibm.com> Subject: [PATCH v2 2/7] powerpc/kernel: Add uevents in EEH error/resume Date: Mon, 18 Dec 2017 16:38:03 -0600 In-Reply-To: <20171218223808.83928-1-bryantly@linux.vnet.ibm.com> References: <20171218223808.83928-1-bryantly@linux.vnet.ibm.com> Message-Id: <20171218223808.83928-3-bryantly@linux.vnet.ibm.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk

Bryant G. Ly Dec. 18, 2017, 10:38 p.m. UTC

Devices can go offline when EEH is reported. This patch adds
a change to the kernel object and lets udev know of error.
When device resumes a change is also set reporting device as
online. Therefore, EEH events are better propagated to user
space for devices in powerpc arch.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 6 ++++++
 1 file changed, 6 insertions(+)

Bjorn Helgaas Dec. 19, 2017, 4:50 a.m. UTC | #1

[+cc Keith, Gabriele, Dongdong]

On Mon, Dec 18, 2017 at 04:38:03PM -0600, Bryant G. Ly wrote:
> Devices can go offline when EEH is reported. This patch adds
> a change to the kernel object and lets udev know of error.
> When device resumes a change is also set reporting device as
> online. Therefore, EEH events are better propagated to user
> space for devices in powerpc arch.

I'm on vacation and can't review this in detail, but I wonder if you
can compare this with the uevents we emit for DPC, AER, and hotplug
events (if any).  I hope we don't end up with userspace having to be
aware of the differences between EEH, DPC, AER, etc.

From a very quick look, I only see a few uevents even mentioned in
drivers/pci: KOBJ_ADD in __pci_hp_register() and KOBJ_CHANGE in the
SR-IOV code.  I'm worried that we're missing some important uevents in
the PCI core.  That's not an argument against what you're doing here;
it just would be nice to fill in any missing pieces in the core also,
and hopefully make them consistent with these EEH events.

> Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
> Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/eeh_driver.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 3c0fa99c5533..9d4e8177c2e0 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -204,6 +204,7 @@ static void *eeh_report_error(void *data, void *userdata)
>  	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
>  	enum pci_ers_result rc, *res = userdata;
>  	struct pci_driver *driver;
> +	char *envp[] = {"EVENT=EEH_ERROR", "ONLINE=0", NULL};
>  
>  	if (!dev || eeh_dev_removed(edev) || eeh_pe_passed(edev->pe))
>  		return NULL;
> @@ -228,6 +229,7 @@ static void *eeh_report_error(void *data, void *userdata)
>  
>  	edev->in_error = true;
>  	eeh_pcid_put(dev);
> +	kobject_uevent_env(&dev->dev.kobj, KOBJ_CHANGE, envp);
>  	return NULL;
>  }
>  
> @@ -358,6 +360,7 @@ static void *eeh_report_resume(void *data, void *userdata)
>  	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
>  	bool was_in_error;
>  	struct pci_driver *driver;
> +	char *envp[] = {"EVENT=EEH_RESUME", "ONLINE=1", NULL};
>  
>  	if (!dev || eeh_dev_removed(edev) || eeh_pe_passed(edev->pe))
>  		return NULL;
> @@ -381,6 +384,7 @@ static void *eeh_report_resume(void *data, void *userdata)
>  	driver->err_handler->resume(dev);
>  
>  	eeh_pcid_put(dev);
> +	kobject_uevent_env(&dev->dev.kobj, KOBJ_CHANGE, envp);
>  	return NULL;
>  }
>  
> @@ -397,6 +401,7 @@ static void *eeh_report_failure(void *data, void *userdata)
>  	struct eeh_dev *edev = (struct eeh_dev *)data;
>  	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
>  	struct pci_driver *driver;
> +	char * envp[] = {"EVENT=EEH_PERMANENT_FAILURE", "ONLINE=0", NULL};
>  
>  	if (!dev || eeh_dev_removed(edev) || eeh_pe_passed(edev->pe))
>  		return NULL;
> @@ -415,6 +420,7 @@ static void *eeh_report_failure(void *data, void *userdata)
>  
>  	driver->err_handler->error_detected(dev, pci_channel_io_perm_failure);
>  
> +	kobject_uevent_env(&dev->dev.kobj, KOBJ_CHANGE, envp);
>  	eeh_pcid_put(dev);
>  	return NULL;
>  }
> -- 
> 2.14.3 (Apple Git-98)
>

Russell Currey Dec. 19, 2017, 4:59 a.m. UTC | #2

On Mon, 2017-12-18 at 22:50 -0600, Bjorn Helgaas wrote:
> [+cc Keith, Gabriele, Dongdong]
> 
> On Mon, Dec 18, 2017 at 04:38:03PM -0600, Bryant G. Ly wrote:
> > Devices can go offline when EEH is reported. This patch adds
> > a change to the kernel object and lets udev know of error.
> > When device resumes a change is also set reporting device as
> > online. Therefore, EEH events are better propagated to user
> > space for devices in powerpc arch.
> 
> I'm on vacation and can't review this in detail, but I wonder if you
> can compare this with the uevents we emit for DPC, AER, and hotplug
> events (if any).  I hope we don't end up with userspace having to be
> aware of the differences between EEH, DPC, AER, etc.
> 
> From a very quick look, I only see a few uevents even mentioned in
> drivers/pci: KOBJ_ADD in __pci_hp_register() and KOBJ_CHANGE in the
> SR-IOV code.  I'm worried that we're missing some important uevents
> in
> the PCI core.  That's not an argument against what you're doing here;
> it just would be nice to fill in any missing pieces in the core also,
> and hopefully make them consistent with these EEH events.

I don't think this needs to be particularly complex, could we get away
with events for when devices do the following?

- begin recovery
- successfully recover
- fail recovery

It might be worthwhile sorting out some consistent, non-EEH-specific
naming, and then other device error recovery systems can do the same
later.

- Russell

> 
> > Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
> > Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>

Benjamin Herrenschmidt Dec. 19, 2017, 6:27 a.m. UTC | #3

On Mon, 2017-12-18 at 22:50 -0600, Bjorn Helgaas wrote:
> [+cc Keith, Gabriele, Dongdong]
> 
> On Mon, Dec 18, 2017 at 04:38:03PM -0600, Bryant G. Ly wrote:
> > Devices can go offline when EEH is reported. This patch adds
> > a change to the kernel object and lets udev know of error.
> > When device resumes a change is also set reporting device as
> > online. Therefore, EEH events are better propagated to user
> > space for devices in powerpc arch.
> 
> I'm on vacation and can't review this in detail, but I wonder if you
> can compare this with the uevents we emit for DPC, AER, and hotplug
> events (if any).  I hope we don't end up with userspace having to be
> aware of the differences between EEH, DPC, AER, etc.
> 
> > From a very quick look, I only see a few uevents even mentioned in
> 
> drivers/pci: KOBJ_ADD in __pci_hp_register() and KOBJ_CHANGE in the
> SR-IOV code.  I'm worried that we're missing some important uevents in
> the PCI core.  That's not an argument against what you're doing here;
> it just would be nice to fill in any missing pieces in the core also,
> and hopefully make them consistent with these EEH events.

We also need to be careful about what specific EEH activity we are
talking about, and if we bring into the picture things like DPDK, it
gets even more murky...

The basic way EEH is supposed to work for recovery (minus all sort of
implementation nasties which hopefully Russell and Sam are trying to
cleanup and fix) is that either:

	- The driver of the device has recovery callbacks, in which
case the driver participates in the recovery process, the device
doesn't "go away" (though it shouldn't be accessed during that process
by other entities, userspace originated config space could be a problem
and needs to be blocked...). The recovery typically involves a reset of
the device but in sync with the driver.

	- The driver doesn't have the callbacks. In this case, we
simulate an unplug, reset the device, and replug.

So it makes sense for the second case to emit the same uevents as a
normal PCI(e) hotplug.

For the former case I'm less sure.... Do we really need userspace to be
notified ? If yes, what for precisely ?

Cheers,
Ben.

Juan Alvarez Dec. 21, 2017, 3:04 a.m. UTC | #4

On 12/18/17 10:59 PM, Russell Currey wrote:

> On Mon, 2017-12-18 at 22:50 -0600, Bjorn Helgaas wrote:
>> [+cc Keith, Gabriele, Dongdong]
>>
>> On Mon, Dec 18, 2017 at 04:38:03PM -0600, Bryant G. Ly wrote:
>>> Devices can go offline when EEH is reported. This patch adds
>>> a change to the kernel object and lets udev know of error.
>>> When device resumes a change is also set reporting device as
>>> online. Therefore, EEH events are better propagated to user
>>> space for devices in powerpc arch.
>> I'm on vacation and can't review this in detail, but I wonder if you
>> can compare this with the uevents we emit for DPC, AER, and hotplug
>> events (if any).  I hope we don't end up with userspace having to be
>> aware of the differences between EEH, DPC, AER, etc.
>>
>> From a very quick look, I only see a few uevents even mentioned in
>> drivers/pci: KOBJ_ADD in __pci_hp_register() and KOBJ_CHANGE in the
>> SR-IOV code.  I'm worried that we're missing some important uevents
>> in
>> the PCI core.  

The only place where I see the KOBJ_REMOVE being used is when the device is 
removed in pci_destroy_dev -> device_del whic will be called implicitly
in permanent failure path of EEH code

>> That's not an argument against what you're doing here;
>> it just would be nice to fill in any missing pieces in the core also,
>> and hopefully make them consistent with these EEH events.
> I don't think this needs to be particularly complex, could we get away
> with events for when devices do the following?
>
> - begin recovery
> - successfully recover
> - fail recovery

If there are no objections in the on going review of this patch 
I can change them to these names:

  - BEGIN_RECOVERY
  - SUCCESSFUL_RECOVERY
  - FAILED_RECOVERY

>
> It might be worthwhile sorting out some consistent, non-EEH-specific
> naming, and then other device error recovery systems can do the same
> later.
>
Do you have a more consistent naming in mind for these events? 

- Juan

Juan Alvarez Dec. 21, 2017, 3:04 a.m. UTC | #5

On 12/19/17 12:27 AM, Benjamin Herrenschmidt wrote:

> On Mon, 2017-12-18 at 22:50 -0600, Bjorn Helgaas wrote:
>> [+cc Keith, Gabriele, Dongdong]
>>
>> On Mon, Dec 18, 2017 at 04:38:03PM -0600, Bryant G. Ly wrote:
>>> Devices can go offline when EEH is reported. This patch adds
>>> a change to the kernel object and lets udev know of error.
>>> When device resumes a change is also set reporting device as
>>> online. Therefore, EEH events are better propagated to user
>>> space for devices in powerpc arch.
>> I'm on vacation and can't review this in detail, but I wonder if you
>> can compare this with the uevents we emit for DPC, AER, and hotplug
>> events (if any).  I hope we don't end up with userspace having to be
>> aware of the differences between EEH, DPC, AER, etc.
>>
>>> From a very quick look, I only see a few uevents even mentioned in
>> drivers/pci: KOBJ_ADD in __pci_hp_register() and KOBJ_CHANGE in the
>> SR-IOV code.  I'm worried that we're missing some important uevents in
>> the PCI core.  That's not an argument against what you're doing here;
>> it just would be nice to fill in any missing pieces in the core also,
>> and hopefully make them consistent with these EEH events.
> We also need to be careful about what specific EEH activity we are
> talking about, and if we bring into the picture things like DPDK, it
> gets even more murky...
>
> The basic way EEH is supposed to work for recovery (minus all sort of
> implementation nasties which hopefully Russell and Sam are trying to
> cleanup and fix) is that either:
>
> 	- The driver of the device has recovery callbacks, in which
> case the driver participates in the recovery process, the device
> doesn't "go away" (though it shouldn't be accessed during that process
> by other entities, userspace originated config space could be a problem
> and needs to be blocked...). The recovery typically involves a reset of
> the device but in sync with the driver.
>
> 	- The driver doesn't have the callbacks. In this case, we
> simulate an unplug, reset the device, and replug.
>
> So it makes sense for the second case to emit the same uevents as a
> normal PCI(e) hotplug.
>
> For the former case I'm less sure.... Do we really need userspace to be
> notified ? If yes, what for precisely ?

In pSeries SR-IOV environment the management console might need to apply
certain configuration changes to the PF driver after it has been recovered
and before the VF drivers are allowed to resume their recovery path.
I could not think of another way to notify user space of these events.
I made this assumption because I saw there were no uevents added when 
the device goes offline and come back online in EEH code. It was my 
intention to make the event as generic as possible in EEH component,
therefore, making this change independent of pSeries SR-IOV.

- Juan

Bjorn Helgaas Dec. 28, 2017, 11:22 p.m. UTC | #6

On Wed, Dec 20, 2017 at 09:04:27PM -0600, Juan Alvarez wrote:
> On 12/19/17 12:27 AM, Benjamin Herrenschmidt wrote:
> 
> > On Mon, 2017-12-18 at 22:50 -0600, Bjorn Helgaas wrote:
> >> [+cc Keith, Gabriele, Dongdong]
> >>
> >> On Mon, Dec 18, 2017 at 04:38:03PM -0600, Bryant G. Ly wrote:
> >>> Devices can go offline when EEH is reported. This patch adds
> >>> a change to the kernel object and lets udev know of error.
> >>> When device resumes a change is also set reporting device as
> >>> online. Therefore, EEH events are better propagated to user
> >>> space for devices in powerpc arch.
> >> I'm on vacation and can't review this in detail, but I wonder if you
> >> can compare this with the uevents we emit for DPC, AER, and hotplug
> >> events (if any).  I hope we don't end up with userspace having to be
> >> aware of the differences between EEH, DPC, AER, etc.
> >>
> >>> From a very quick look, I only see a few uevents even mentioned in
> >> drivers/pci: KOBJ_ADD in __pci_hp_register() and KOBJ_CHANGE in the
> >> SR-IOV code.  I'm worried that we're missing some important uevents in
> >> the PCI core.  That's not an argument against what you're doing here;
> >> it just would be nice to fill in any missing pieces in the core also,
> >> and hopefully make them consistent with these EEH events.
> > We also need to be careful about what specific EEH activity we are
> > talking about, and if we bring into the picture things like DPDK, it
> > gets even more murky...
> >
> > The basic way EEH is supposed to work for recovery (minus all sort of
> > implementation nasties which hopefully Russell and Sam are trying to
> > cleanup and fix) is that either:
> >
> > 	- The driver of the device has recovery callbacks, in which
> > case the driver participates in the recovery process, the device
> > doesn't "go away" (though it shouldn't be accessed during that process
> > by other entities, userspace originated config space could be a problem
> > and needs to be blocked...). The recovery typically involves a reset of
> > the device but in sync with the driver.
> >
> > 	- The driver doesn't have the callbacks. In this case, we
> > simulate an unplug, reset the device, and replug.
> >
> > So it makes sense for the second case to emit the same uevents as a
> > normal PCI(e) hotplug.
> >
> > For the former case I'm less sure.... Do we really need userspace to be
> > notified ? If yes, what for precisely ?
> 
> In pSeries SR-IOV environment the management console might need to apply
> certain configuration changes to the PF driver after it has been recovered
> and before the VF drivers are allowed to resume their recovery path.
> I could not think of another way to notify user space of these events.
> I made this assumption because I saw there were no uevents added when 
> the device goes offline and come back online in EEH code. It was my 
> intention to make the event as generic as possible in EEH component,
> therefore, making this change independent of pSeries SR-IOV.

I don't know what your plan for this is, but we do have two different
paths that use the struct pci_error_handlers hooks that drivers may
supply.  There's this AER path that may be used on all arches:

  aer_isr
    get_e_source              # remove from rpc->e_sources[] queue
    aer_isr_one_error
      aer_process_err_devices
        handle_error_source   # or aer_recover_work_func
          do_recovery         # for uncorrectable (fatal/nonfatal) only
            broadcast_error_message(dev, ..., report_error_detected)
              pci_walk_bus(..., report_error_detected)
                report_error_detected
                  dev->driver->err_handler->error_detected

And there's this powerpc path where you're adding a uevent:

  eeh_event_handler
    eeh_handle_event
      eeh_handle_normal_event
        eeh_pe_dev_traverse(pe, eeh_report_error, &result)
          eeh_report_error
            driver->err_handler->error_detected(dev, pci_channel_io_frozen)
 +          kobject_uevent_env(&dev->dev.kobj, KOBJ_CHANGE, envp);

Both paths end up calling the pci_error_handlers.error_detected()
hook.

Drivers are not supposed to care what arch they're running on.  If the
driver supplies an .error_detected() entry point, it's up to the PCI
core and powerpc code to use it consistently across arches.  That
means the same uevents (if any) should be emitted from both paths.

The best way would be to unify the call of .error_detected() so the
AER path and the powerpc path do it via the same function.  The AER
report_error_detected() and the powerpc eeh_report_error() do look
fairly similar, so this seems possible in principle, but I'm not
holding my breath.

Bjorn

Benjamin Herrenschmidt Dec. 29, 2017, 12:02 a.m. UTC | #7

On Thu, 2017-12-28 at 17:22 -0600, Bjorn Helgaas wrote:
> Both paths end up calling the pci_error_handlers.error_detected()
> hook.
> 
> Drivers are not supposed to care what arch they're running on.  If the
> driver supplies an .error_detected() entry point, it's up to the PCI
> core and powerpc code to use it consistently across arches.  That
> means the same uevents (if any) should be emitted from both paths.
> 
> The best way would be to unify the call of .error_detected() so the
> AER path and the powerpc path do it via the same function.  The AER
> report_error_detected() and the powerpc eeh_report_error() do look
> fairly similar, so this seems possible in principle, but I'm not
> holding my breath.

Factoring these callers into a common function that can then do the
uevent for errors makes a lot of sense.

The "resume" path might be trickier, but even then, rather than calling
directly the driver op, it would be easy to have a little wrapper that
does it, which can then also do the uevent.

Ben.

[v2,2/7] powerpc/kernel: Add uevents in EEH error/resume

Commit Message

Comments

Patch