| Message ID | 1479544367-7195-4-git-send-email-ashok.raj@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Delegated to | Bjorn Helgaas |
On Sat, Nov 19, 2016 at 12:32:47AM -0800, Ashok Raj wrote:
> A surprise link down may retrain very quickly, causing the same slot to
> generate a link up event before handling the link down completes.
>
> Since the link is active, the power off work queued from the first link
> down will cause a second down event when the power is disabled. The second
> down event should be ignored because the slot is already powering off;
> however, the "link up" event sets the slot state to POWERON before the
> event to handle this is enqueued, making the second down event believe
> it needs to do something. This creates a constant link up and down
> event cycle.
>
> This patch fixes that by setting the slot state only when the work to
> handle the power event is executing, protected by the hot plug mutex.

Please mention the mutex specifically by name, e.g.,
"p_slot->hotplug_lock".

> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org
>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Reviewed-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/hotplug/pciehp_ctrl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index ec0b4c1..7ae068c 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -182,6 +182,7 @@ static void pciehp_power_thread(struct work_struct *work)
>  	switch (info->req) {
>  	case DISABLE_REQ:
>  		mutex_lock(&p_slot->hotplug_lock);
> +		p_slot->state = POWEROFF_STATE;

It sounds right that p_slot->state should be protected.

It looks like handle_button_press_event() and
pciehp_sysfs_enable_slot() hold p_slot->lock while updating
p_slot->state.

You're setting "state = POWEROFF_STATE" while holding
p_slot->hotplug_lock (not p_slot->lock). Four lines down, we set
"state = STATIC_STATE", but this time we're holding p_slot->lock.

What is the difference between the p_slot->lock and
p_slot->hotplug_lock? Do we need both? How do we know which one to
use?

I'm very confused.

>  		pciehp_disable_slot(p_slot);
>  		mutex_unlock(&p_slot->hotplug_lock);
>  		mutex_lock(&p_slot->lock);
> @@ -190,6 +191,7 @@ static void pciehp_power_thread(struct work_struct *work)
>  		break;
>  	case ENABLE_REQ:
>  		mutex_lock(&p_slot->hotplug_lock);
> +		p_slot->state = POWERON_STATE;
>  		ret = pciehp_enable_slot(p_slot);
>  		mutex_unlock(&p_slot->hotplug_lock);
>  		if (ret)
> @@ -209,8 +211,6 @@ static void pciehp_queue_power_work(struct slot *p_slot, int req)
>  {
>  	struct power_work_info *info;
>
> -	p_slot->state = (req == ENABLE_REQ) ? POWERON_STATE : POWEROFF_STATE;
> -
>  	info = kmalloc(sizeof(*info), GFP_KERNEL);
>  	if (!info) {
>  		ctrl_err(p_slot->ctrl, "no memory to queue %s request\n",
> --
> 2.7.4
On Wed, Dec 07, 2016 at 05:40:54PM -0600, Bjorn Helgaas wrote:
> On Sat, Nov 19, 2016 at 12:32:47AM -0800, Ashok Raj wrote:
> > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > @@ -182,6 +182,7 @@ static void pciehp_power_thread(struct work_struct *work)
> >  	switch (info->req) {
> >  	case DISABLE_REQ:
> >  		mutex_lock(&p_slot->hotplug_lock);
> > +		p_slot->state = POWEROFF_STATE;
>
> It sounds right that p_slot->state should be protected.
>
> It looks like handle_button_press_event() and
> pciehp_sysfs_enable_slot() hold p_slot->lock while updating
> p_slot->state.
>
> You're setting "state = POWEROFF_STATE" while holding
> p_slot->hotplug_lock (not p_slot->lock). Four lines down, we set
> "state = STATIC_STATE", but this time we're holding p_slot->lock.
>
> What is the difference between the p_slot->lock and
> p_slot->hotplug_lock? Do we need both? How do we know which one to
> use?
>
> I'm very confused.

This is _very_ confusing. :)

The p_slot->hotplug_lock serializes a power-off and a power-on event. We
only want to set the state when one of those events gets to execute
rather than when the event is queued. Changing the state when the event
is queued can trigger a never-ending up-down sequence.

It currently looks safe to nest the p_slot->lock under the
p_slot->hotplug_lock if that is your recommendation.

Alternatively, we could fix this if we used an ordered work queue for
the slot's work, but that is a significantly more complex change.
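For illustration, a minimal sketch of the nesting Keith describes,
assuming it really is safe to take p_slot->lock inside
p_slot->hotplug_lock; this is not code from the thread:

	case DISABLE_REQ:
		mutex_lock(&p_slot->hotplug_lock);
		/* Take p_slot->lock just for the state update, matching
		 * the driver's other writers of p_slot->state
		 * (assumed-safe nesting). */
		mutex_lock(&p_slot->lock);
		p_slot->state = POWEROFF_STATE;
		mutex_unlock(&p_slot->lock);
		pciehp_disable_slot(p_slot);
		mutex_unlock(&p_slot->hotplug_lock);
		mutex_lock(&p_slot->lock);
		p_slot->state = STATIC_STATE;
		mutex_unlock(&p_slot->lock);
		break;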
On Wed, Dec 07, 2016 at 07:04:33PM -0500, Keith Busch wrote:
> On Wed, Dec 07, 2016 at 05:40:54PM -0600, Bjorn Helgaas wrote:
> > On Sat, Nov 19, 2016 at 12:32:47AM -0800, Ashok Raj wrote:
> > > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > > @@ -182,6 +182,7 @@ static void pciehp_power_thread(struct work_struct *work)
> > >  	switch (info->req) {
> > >  	case DISABLE_REQ:
> > >  		mutex_lock(&p_slot->hotplug_lock);
> > > +		p_slot->state = POWEROFF_STATE;
> >
> > It sounds right that p_slot->state should be protected.
> >
> > It looks like handle_button_press_event() and
> > pciehp_sysfs_enable_slot() hold p_slot->lock while updating
> > p_slot->state.
> >
> > You're setting "state = POWEROFF_STATE" while holding
> > p_slot->hotplug_lock (not p_slot->lock). Four lines down, we set
> > "state = STATIC_STATE", but this time we're holding p_slot->lock.
> >
> > What is the difference between the p_slot->lock and
> > p_slot->hotplug_lock? Do we need both? How do we know which one to
> > use?
> >
> > I'm very confused.
>
> This is _very_ confusing. :)
>
> The p_slot->hotplug_lock serializes a power-off and a power-on event. We
> only want to set the state when one of those events gets to execute
> rather than when the event is queued. Changing the state when the event
> is queued can trigger a never-ending up-down sequence.
>
> It currently looks safe to nest the p_slot->lock under the
> p_slot->hotplug_lock if that is your recommendation.

I'm not making a recommendation; that would take a lot more thought
than I've given this.

There are at least three locks in pciehp:

  struct controller.ctrl_lock
  struct slot.lock
  struct slot.hotplug_lock

There shouldn't really be any performance paths in pciehp, so I'm
pretty doubtful that we need such complicated locking.

> Alternatively, we could fix this if we used an ordered work queue for
> the slot's work, but that is a significantly more complex change.

You mean we can currently execute things from the work queue in a
different order than they were enqueued? That sounds ... difficult to
analyze, to say the least.

I don't know much about work queues, and Documentation/workqueue.txt
doesn't mention alloc_ordered_workqueue(). Is that what you are
referring to?

Bjorn
On Thu, Dec 08, 2016 at 09:11:58AM -0600, Bjorn Helgaas wrote:
> On Wed, Dec 07, 2016 at 07:04:33PM -0500, Keith Busch wrote:
> >
> > It currently looks safe to nest the p_slot->lock under the
> > p_slot->hotplug_lock if that is your recommendation.
>
> I'm not making a recommendation; that would take a lot more thought
> than I've given this.
>
> There are at least three locks in pciehp:
>
>   struct controller.ctrl_lock
>   struct slot.lock
>   struct slot.hotplug_lock
>
> There shouldn't really be any performance paths in pciehp, so I'm
> pretty doubtful that we need such complicated locking.

They are protecting different things, but I agree it looks like room
for simplification exists.

> > Alternatively, we could fix this if we used an ordered work queue for
> > the slot's work, but that is a significantly more complex change.
>
> You mean we can currently execute things from the work queue in a
> different order than they were enqueued? That sounds ... difficult to
> analyze, to say the least.

The events are dequeued in order, but they don't wait for the previous
one to complete, so pciehp's current work queue can have multiple events
executing in parallel. That's part of why rapid pciehp slot events are
a little more difficult to follow, and I think we may even be unsafely
relying on the order the mutexes are obtained from these work events.

Partly unrelated: we could process surprise removals significantly
faster (microseconds vs. seconds) with the limited PCI access series
here, giving fewer simultaneously executing events to consider:

https://www.spinics.net/lists/linux-pci/msg55585.html

Do you have any concerns with that series?

> I don't know much about work queues, and Documentation/workqueue.txt
> doesn't mention alloc_ordered_workqueue(). Is that what you are
> referring to?

Yes, alloc_ordered_workqueue() is what I had in mind, though switching
to that is not as simple as calling the different API. I am looking into
that for the longer term, but for the incremental fix, do you think we
can go forward with Raj's proposal?
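For reference, a minimal sketch of the ordered-workqueue idea being
discussed; the queue name and init function below are hypothetical, and
a real conversion would also have to move pciehp's existing work items
onto the new queue:

	#include <linux/errno.h>
	#include <linux/workqueue.h>

	static struct workqueue_struct *pciehp_ordered_wq; /* hypothetical */

	static int pciehp_init_ordered_wq(void) /* hypothetical */
	{
		/*
		 * An ordered workqueue executes at most one work item at
		 * a time, in queueing order, so a power-on event could
		 * never run while the preceding power-off is still in
		 * flight.
		 */
		pciehp_ordered_wq = alloc_ordered_workqueue("pciehp_ordered", 0);
		if (!pciehp_ordered_wq)
			return -ENOMEM;
		return 0;
	}

Work items would then be submitted with queue_work(pciehp_ordered_wq,
...) instead of the existing per-slot queue.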
On Thu, Dec 08, 2016 at 12:20:58PM -0500, Keith Busch wrote:
> On Thu, Dec 08, 2016 at 09:11:58AM -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 07, 2016 at 07:04:33PM -0500, Keith Busch wrote:
> > >
> > > It currently looks safe to nest the p_slot->lock under the
> > > p_slot->hotplug_lock if that is your recommendation.
> >
> > I'm not making a recommendation; that would take a lot more thought
> > than I've given this.
> >
> > There are at least three locks in pciehp:
> >
> >   struct controller.ctrl_lock
> >   struct slot.lock
> >   struct slot.hotplug_lock
> >
> > There shouldn't really be any performance paths in pciehp, so I'm
> > pretty doubtful that we need such complicated locking.
>
> They are protecting different things, but I agree it looks like room
> for simplification exists.

If we can't simplify this immediately, can we add a comment about what
the different locks protect so people have a hint about which one to
use? For example, it looks like this patch might benefit from that
knowledge.

> > > Alternatively, we could fix this if we used an ordered work queue for
> > > the slot's work, but that is a significantly more complex change.
> >
> > You mean we can currently execute things from the work queue in a
> > different order than they were enqueued? That sounds ... difficult to
> > analyze, to say the least.
>
> The events are dequeued in order, but they don't wait for the previous
> one to complete, so pciehp's current work queue can have multiple events
> executing in parallel. That's part of why rapid pciehp slot events are
> a little more difficult to follow, and I think we may even be unsafely
> relying on the order the mutexes are obtained from these work events.

Hmm. I certainly did not expect multiple events executing in parallel.
That sounds like a pretty serious issue to me.

> Partly unrelated: we could process surprise removals significantly
> faster (microseconds vs. seconds) with the limited PCI access series
> here, giving fewer simultaneously executing events to consider:
>
> https://www.spinics.net/lists/linux-pci/msg55585.html
>
> Do you have any concerns with that series?

I'm dragging my feet because I want the removal process to become
simpler to understand, not more complicated, and we're exposing more
issues that I didn't even know about.

> > I don't know much about work queues, and Documentation/workqueue.txt
> > doesn't mention alloc_ordered_workqueue(). Is that what you are
> > referring to?
>
> Yes, alloc_ordered_workqueue() is what I had in mind, though switching
> to that is not as simple as calling the different API. I am looking into
> that for the longer term, but for the incremental fix, do you think we
> can go forward with Raj's proposal?

I'd like to at least see a consistent locking strategy for protecting
p_slot->state. All the existing updates are protected by
p_slot->lock, but the one Raj is adding is protected by
p_slot->hotplug_lock.

Bjorn
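As an illustration of the comment Bjorn asks for, something along these
lines could sit next to the lock declarations; the semantics below are
inferred from this thread, not verified against the full driver:

	/*
	 * pciehp locking (inferred, for illustration only):
	 *
	 *  ctrl->ctrl_lock      - serializes accesses to the controller's
	 *                         hotplug control/status registers
	 *  p_slot->lock         - protects p_slot->state and the slot's
	 *                         queued work
	 *  p_slot->hotplug_lock - serializes slot power-off against
	 *                         power-on, so only one executes at a time
	 */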
Hi Bjorn,

Thanks. I have resent the last patch with consistent lock usage, as you
requested.

I did attempt to make things a bit easier to understand in one of my
earlier experiments, but that resulted in very substantial changes. It
is certainly something we should look at in the future to make the
state machine more robust.

Cheers,
Ashok

On Thu, Dec 08, 2016 at 11:50:09AM -0600, Bjorn Helgaas wrote:
> > Yes, alloc_ordered_workqueue() is what I had in mind, though switching
> > to that is not as simple as calling the different API. I am looking into
> > that for the longer term, but for the incremental fix, do you think we
> > can go forward with Raj's proposal?
>
> I'd like to at least see a consistent locking strategy for protecting
> p_slot->state. All the existing updates are protected by
> p_slot->lock, but the one Raj is adding is protected by
> p_slot->hotplug_lock.
>
> Bjorn
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index ec0b4c1..7ae068c 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -182,6 +182,7 @@ static void pciehp_power_thread(struct work_struct *work)
 	switch (info->req) {
 	case DISABLE_REQ:
 		mutex_lock(&p_slot->hotplug_lock);
+		p_slot->state = POWEROFF_STATE;
 		pciehp_disable_slot(p_slot);
 		mutex_unlock(&p_slot->hotplug_lock);
 		mutex_lock(&p_slot->lock);
@@ -190,6 +191,7 @@ static void pciehp_power_thread(struct work_struct *work)
 		break;
 	case ENABLE_REQ:
 		mutex_lock(&p_slot->hotplug_lock);
+		p_slot->state = POWERON_STATE;
 		ret = pciehp_enable_slot(p_slot);
 		mutex_unlock(&p_slot->hotplug_lock);
 		if (ret)
@@ -209,8 +211,6 @@ static void pciehp_queue_power_work(struct slot *p_slot, int req)
 {
 	struct power_work_info *info;
 
-	p_slot->state = (req == ENABLE_REQ) ? POWERON_STATE : POWEROFF_STATE;
-
 	info = kmalloc(sizeof(*info), GFP_KERNEL);
 	if (!info) {
 		ctrl_err(p_slot->ctrl, "no memory to queue %s request\n",
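The resent patch with the consistent locking is not part of this
archive. A minimal sketch of what it might look like in the ENABLE_REQ
arm, again assuming p_slot->lock can safely nest inside
p_slot->hotplug_lock as discussed above:

	case ENABLE_REQ:
		mutex_lock(&p_slot->hotplug_lock);
		/* Update the state under p_slot->lock, consistent with
		 * every other writer of p_slot->state (assumed-safe
		 * nesting). */
		mutex_lock(&p_slot->lock);
		p_slot->state = POWERON_STATE;
		mutex_unlock(&p_slot->lock);
		ret = pciehp_enable_slot(p_slot);
		mutex_unlock(&p_slot->hotplug_lock);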