Message ID | 1573567588-47048-7-git-send-email-alex.shi@linux.alibaba.com
---|---
State | New, archived
Series | per lruvec lru_lock for memcg
On Tue, Nov 12, 2019 at 10:06:26PM +0800, Alex Shi wrote:
> Intel 0day reported a performance regression on this patchset.
> The detailed info points to rcu_read_lock + PROVE_LOCKING, which causes
> queued_spin_lock_slowpath to wait too long to get the lock.
> Removing rcu_read_lock is safe here since we already hold a spinlock.

Argh.  You have not sent these patches in a properly reviewable form!
I wasted all that time reviewing the earlier patch in this series only to
find out that you changed it here.  FIX THE PATCH, don't send a fix-patch
on top of it!
On 2019/11/12 at 10:38 PM, Matthew Wilcox wrote:
> On Tue, Nov 12, 2019 at 10:06:26PM +0800, Alex Shi wrote:
>> Intel 0day reported a performance regression on this patchset.
>> The detailed info points to rcu_read_lock + PROVE_LOCKING, which causes
>> queued_spin_lock_slowpath to wait too long to get the lock.
>> Removing rcu_read_lock is safe here since we already hold a spinlock.
> Argh.  You have not sent these patches in a properly reviewable form!
> I wasted all that time reviewing the earlier patch in this series only to
> find out that you changed it here.  FIX THE PATCH, don't send a fix-patch
> on top of it!

Hi Matthew,

Very sorry for your time! The main reasons I used a separate patch are:
a, Intel 0day asked me to credit their finding, and I don't know how to give
a clear/elegant explanation for a non-existent regression in a fixed patch.
b, this regression is pretty tricky. Maybe it's better to say thanks in the
version change log of the cover letter?

Anyway, thanks a lot for your review!
Alex
On Wed, Nov 13, 2019 at 10:40:58AM +0800, Alex Shi wrote:
>
> On 2019/11/12 at 10:38 PM, Matthew Wilcox wrote:
> > On Tue, Nov 12, 2019 at 10:06:26PM +0800, Alex Shi wrote:
> >> Intel 0day reported a performance regression on this patchset.
> >> The detailed info points to rcu_read_lock + PROVE_LOCKING, which causes
> >> queued_spin_lock_slowpath to wait too long to get the lock.
> >> Removing rcu_read_lock is safe here since we already hold a spinlock.
> > Argh.  You have not sent these patches in a properly reviewable form!
> > I wasted all that time reviewing the earlier patch in this series only to
> > find out that you changed it here.  FIX THE PATCH, don't send a fix-patch
> > on top of it!
>
> Hi Matthew,
>
> Very sorry for your time! The main reasons I used a separate patch are:
> a, Intel 0day asked me to credit their finding, and I don't know how to give
> a clear/elegant explanation for a non-existent regression in a fixed patch.
> b, this regression is pretty tricky. Maybe it's better to say thanks in the
> version change log of the cover letter?
>

Add something like this to the patch

[lkp@intel.com: Fix RCU-related regression reported by LKP robot]
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
...
On 2019/11/13 at 7:40 PM, Mel Gorman wrote:
>> Hi Matthew,
>>
>> Very sorry for your time! The main reasons I used a separate patch are:
>> a, Intel 0day asked me to credit their finding, and I don't know how to give
>> a clear/elegant explanation for a non-existent regression in a fixed patch.
>> b, this regression is pretty tricky. Maybe it's better to say thanks in the
>> version change log of the cover letter?
>>
> Add something like this to the patch
>
> [lkp@intel.com: Fix RCU-related regression reported by LKP robot]
> Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>

It's a good idea! Thanks a lot, Mel!
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 2421b720d272..f869897a68f0 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1307,20 +1307,18 @@ static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
 	struct pglist_data *pgdat = page_pgdat(page);
 	struct lruvec *lruvec;
 
-	rcu_read_lock();
+	if (!locked_lruvec)
+		goto lock;
+
 	lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
-	if (locked_lruvec == lruvec) {
-		rcu_read_unlock();
+	if (locked_lruvec == lruvec)
 		return lruvec;
-	}
-	rcu_read_unlock();
 
-	if (locked_lruvec)
-		spin_unlock_irq(&locked_lruvec->lru_lock);
+	spin_unlock_irq(&locked_lruvec->lru_lock);
 
+lock:
 	lruvec = lock_page_lruvec_irq(page, pgdat);
-
 	return lruvec;
 }
 
@@ -1331,21 +1329,18 @@ static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
 	struct pglist_data *pgdat = page_pgdat(page);
 	struct lruvec *lruvec;
 
-	rcu_read_lock();
+	if (!locked_lruvec)
+		goto lock;
+
 	lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
-	if (locked_lruvec == lruvec) {
-		rcu_read_unlock();
+	if (locked_lruvec == lruvec)
 		return lruvec;
-	}
-	rcu_read_unlock();
 
-	if (locked_lruvec)
-		spin_unlock_irqrestore(&locked_lruvec->lru_lock,
-				       locked_lruvec->flags);
+	spin_unlock_irqrestore(&locked_lruvec->lru_lock, locked_lruvec->flags);
 
+lock:
 	lruvec = lock_page_lruvec_irqsave(page, pgdat);
-
 	return lruvec;
 }
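For readability, this is roughly what relock_page_lruvec_irq() looks like with the first hunk applied. The function signature (in particular the locked_lruvec parameter) and the mem_cgroup_page_lruvec()/lock_page_lruvec_irq() helpers come from earlier patches in this series rather than mainline, so treat this as a sketch of the resulting helper, not the final upstream code:

static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
						    struct lruvec *locked_lruvec)
{
	struct pglist_data *pgdat = page_pgdat(page);
	struct lruvec *lruvec;

	/* Nothing locked yet: just take the lock for this page's lruvec. */
	if (!locked_lruvec)
		goto lock;

	lruvec = mem_cgroup_page_lruvec(page, pgdat);

	/* Same lruvec as before: keep the lock we already hold. */
	if (locked_lruvec == lruvec)
		return lruvec;

	/* Page belongs to a different lruvec: drop the old lock first. */
	spin_unlock_irq(&locked_lruvec->lru_lock);

lock:
	lruvec = lock_page_lruvec_irq(page, pgdat);
	return lruvec;
}

The irqsave variant follows the same pattern, restoring locked_lruvec->flags when it drops the old lock.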
Intel 0day reported a performance regression on this patchset. The
detailed info points to rcu_read_lock + PROVE_LOCKING, which causes
queued_spin_lock_slowpath to wait too long to get the lock.

Removing rcu_read_lock is safe here since we already hold a spinlock.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/memcontrol.h | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)
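As a reproduction hint, the lock proving referred to above is the lockdep validator; the options below are the standard lockdep settings (not values taken from the 0day report), and a debug kernel built with something like them should exercise the rcu_read_lock + PROVE_LOCKING path that showed the slowdown:

CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_PROVE_RCU=y

With lockdep enabled, rcu_read_lock() is itself tracked as a lock acquisition, which is presumably the extra overhead the report attributes to time spent in queued_spin_lock_slowpath.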