Message ID | 1569479378-128669-1-git-send-email-zhangxiaoxu5@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request |
ping.

On 2019/9/26 14:29, ZhangXiaoxu wrote:
> [...]
On Sat, 2019-10-05 at 09:51 +0800, zhangxiaoxu (A) wrote:
> ping.
>
> On 2019/9/26 14:29, ZhangXiaoxu wrote:
> > [...]
> > @@ -799,8 +798,10 @@ static void nfs_inode_remove_request(struct nfs_page *req)
> >  		spin_unlock(&mapping->private_lock);
> >  	}
> >
> > -	if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags))
> > +	if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) {
> >  		nfs_release_request(req);
> > +		atomic_long_dec(&nfsi->nrequests);
> > +	}
> >  }
> >
> >  static void

Hmm... What about nfs_page_group_init()? That also bumps
nfsi->nrequests.
On Sat, 2019-10-05 at 14:07 +0000, Trond Myklebust wrote:
> On Sat, 2019-10-05 at 09:51 +0800, zhangxiaoxu (A) wrote:
> > [...]
>
> Hmm... What about nfs_page_group_init()? That also bumps
> nfsi->nrequests.

Ah... Never mind, that's just copying the PG_INODE_REF flag to the new
subrequest.

However nfs_lock_and_join_requests() looks like it does need to change
to something like the following:

 	/* Postpone destruction of this request */
-	if (test_and_clear_bit(PG_REMOVE, &head->wb_flags)) {
-		set_bit(PG_INODE_REF, &head->wb_flags);
+	if (test_and_clear_bit(PG_REMOVE, &head->wb_flags) &&
+	    !test_and_set_bit(PG_INODE_REF, &head->wb_flags)) {
 		kref_get(&head->wb_kref);
 		atomic_long_inc(&NFS_I(inode)->nrequests);
 	}

Do you agree?
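The guard suggested in the message above can be tried out in userspace. The following is only a minimal sketch using C11 atomics, not kernel code: the `model_*` names, the `MODEL_PG_*` bit numbers, and the global `nrequests` counter are hypothetical stand-ins for `test_and_set_bit()`, `test_and_clear_bit()`, the `PG_*` flags, and `nfsi->nrequests`. It shows that with the `!test_and_set_bit()` guard the reference and the counter are bumped only when the flag actually goes from clear to set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers standing in for the kernel's PG_* flags. */
enum { MODEL_PG_REMOVE = 0, MODEL_PG_INODE_REF = 1 };

struct model_req {
	atomic_ulong wb_flags;	/* models nfs_page::wb_flags */
	atomic_long wb_kref;	/* models the kref reference count */
};

static atomic_long nrequests;	/* models nfsi->nrequests, starts at 0 */

/* Userspace analogue of test_and_set_bit(): returns the old bit value. */
static bool model_test_and_set_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return atomic_fetch_or(addr, mask) & mask;
}

/* Userspace analogue of test_and_clear_bit(): returns the old bit value. */
static bool model_test_and_clear_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return atomic_fetch_and(addr, ~mask) & mask;
}

/* Shape of the suggested change: count only on a clear-to-set transition. */
static void model_postpone_destruction(struct model_req *head)
{
	if (model_test_and_clear_bit(MODEL_PG_REMOVE, &head->wb_flags) &&
	    !model_test_and_set_bit(MODEL_PG_INODE_REF, &head->wb_flags)) {
		atomic_fetch_add(&head->wb_kref, 1);
		atomic_fetch_add(&nrequests, 1);
	}
}

/* Both flags already set: no extra reference, counter left untouched. */
static long model_demo(void)
{
	struct model_req head;

	atomic_init(&head.wb_flags,
		    (1UL << MODEL_PG_REMOVE) | (1UL << MODEL_PG_INODE_REF));
	atomic_init(&head.wb_kref, 1);
	model_postpone_destruction(&head);
	assert(atomic_load(&head.wb_kref) == 1);
	return atomic_load(&nrequests);
}
```

Whether both flags can ever be set at this point in the real code is a separate question about the locking in fs/nfs/write.c; the sketch only illustrates what the guard does if they are.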
Thanks for your review.

I think PG_REMOVE and PG_INODE_REF can't both be set except in the
'nfs_inode_remove_request' function.

nfs_inode_remove_request
  // PG_REMOVE may be set here.
  nfs_page_group_sync_on_bit(req, PG_REMOVE)
    nfs_page_group_lock(req);
    nfs_page_group_sync_on_bit_locked
      WARN_ON_ONCE(test_and_set_bit(bit, &req->wb_flags));
    nfs_page_group_unlock(req);
  // But the PG_INODE_REF flag is also cleared.
  test_and_clear_bit(PG_INODE_REF, &req->wb_flags)

'nfs_lock_and_join_requests' also needs the PG_HEADLOCK flag:

nfs_lock_and_join_requests
  nfs_page_group_lock(head);
  test_and_clear_bit(PG_REMOVE, &head->wb_flags)
  nfs_page_group_unlock(head);

On 2019/10/5 22:35, Trond Myklebust wrote:
> However nfs_lock_and_join_requests() looks like it does need to change
> to something like the following:
>
>  	/* Postpone destruction of this request */
> -	if (test_and_clear_bit(PG_REMOVE, &head->wb_flags)) {
> -		set_bit(PG_INODE_REF, &head->wb_flags);
> +	if (test_and_clear_bit(PG_REMOVE, &head->wb_flags) &&
> +	    !test_and_set_bit(PG_INODE_REF, &head->wb_flags)) {
>  		kref_get(&head->wb_kref);
>  		atomic_long_inc(&NFS_I(inode)->nrequests);
>  	}
>
> Do you agree?
On Tue, 2019-10-08 at 10:00 +0800, zhangxiaoxu (A) wrote:
> Thanks for your review.
>
> I think PG_REMOVE and PG_INODE_REF can't both be set except in the
> 'nfs_inode_remove_request' function.
>
> nfs_inode_remove_request
>   // PG_REMOVE may be set here.
>   nfs_page_group_sync_on_bit(req, PG_REMOVE)
>     nfs_page_group_lock(req);
>     nfs_page_group_sync_on_bit_locked
>       WARN_ON_ONCE(test_and_set_bit(bit, &req->wb_flags));
>     nfs_page_group_unlock(req);
>   // But the PG_INODE_REF flag is also cleared.
>   test_and_clear_bit(PG_INODE_REF, &req->wb_flags)
>
> 'nfs_lock_and_join_requests' also needs the PG_HEADLOCK flag:
>
> nfs_lock_and_join_requests
>   nfs_page_group_lock(head);
>   test_and_clear_bit(PG_REMOVE, &head->wb_flags)
>   nfs_page_group_unlock(head);

Fair enough, and the race between nfs_inode_remove_request() and
nfs_lock_and_join_requests() with the setting of PG_REMOVE and the
clearing of PG_INODE_REF is prevented by the presence of the PG_LOCK.

OK. I agree with that argument. Thanks for the explanation!

Cheers
  Trond
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 85ca495..52cab65 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -786,7 +786,6 @@ static void nfs_inode_remove_request(struct nfs_page *req)
 	struct nfs_inode *nfsi = NFS_I(inode);
 	struct nfs_page *head;
 
-	atomic_long_dec(&nfsi->nrequests);
 	if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) {
 		head = req->wb_head;
 
@@ -799,8 +798,10 @@ static void nfs_inode_remove_request(struct nfs_page *req)
 		spin_unlock(&mapping->private_lock);
 	}
 
-	if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags))
+	if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) {
 		nfs_release_request(req);
+		atomic_long_dec(&nfsi->nrequests);
+	}
 }
 
 static void
When running xfstests, warnings like the following are reported:

WARNING: CPU: 0 PID: 6235 at fs/nfs/inode.c:122 nfs_clear_inode+0x9c/0xd8
Modules linked in:
CPU: 0 PID: 6235 Comm: umount.nfs
Hardware name: linux,dummy-virt (DT)
pstate: 60000005 (nZCv daif -PAN -UAO)
pc : nfs_clear_inode+0x9c/0xd8
lr : nfs_evict_inode+0x60/0x78
sp : fffffc000f68fc00
x29: fffffc000f68fc00 x28: fffffe00c53155c0
x27: fffffe00c5315000 x26: fffffc0009a63748
x25: fffffc000f68fd18 x24: fffffc000bfaaf40
x23: fffffc000936d3c0 x22: fffffe00c4ff5e20
x21: fffffc000bfaaf40 x20: fffffe00c4ff5d10
x19: fffffc000c056000 x18: 000000000000003c
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000040 x14: 0000000000000228
x13: fffffc000c3a2000 x12: 0000000000000045
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : 0000000000000000
x7 : 0000000000000000 x6 : fffffc00084b027c
x5 : fffffc0009a64000 x4 : fffffe00c0e77400
x3 : fffffc000c0563a8 x2 : fffffffffffffffb
x1 : 000000000000764e x0 : 0000000000000001
Call trace:
 nfs_clear_inode+0x9c/0xd8
 nfs_evict_inode+0x60/0x78
 evict+0x108/0x380
 dispose_list+0x70/0xa0
 evict_inodes+0x194/0x210
 generic_shutdown_super+0xb0/0x220
 nfs_kill_super+0x40/0x88
 deactivate_locked_super+0xb4/0x120
 deactivate_super+0x144/0x160
 cleanup_mnt+0x98/0x148
 __cleanup_mnt+0x38/0x50
 task_work_run+0x114/0x160
 do_notify_resume+0x2f8/0x308
 work_pending+0x8/0x14

nrequests should be incremented/decremented only if the PG_INODE_REF
flag is set.

But nfs_inode_remove_request() may decrement it even when the
PG_INODE_REF flag is not set, which can leave the nrequests count
wrong.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
---
 fs/nfs/write.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
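The accounting rule stated in the commit message can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: all `model_*` names and `MODEL_PG_INODE_REF` are hypothetical, the add path is simplified so that it counts only when the flag goes from clear to set, and the scenario (the removal path running twice for one request) is just a convenient way to reach removal with the flag already clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_PG_INODE_REF (1UL << 0)	/* hypothetical flag bit */

struct model_inode {
	atomic_long nrequests;		/* models nfsi->nrequests */
};

struct model_req {
	atomic_ulong wb_flags;		/* models nfs_page::wb_flags */
	struct model_inode *inode;
};

/* Simplified add path: count only when the flag goes from clear to set. */
static void model_add_request(struct model_req *req)
{
	if (!(atomic_fetch_or(&req->wb_flags, MODEL_PG_INODE_REF) &
	      MODEL_PG_INODE_REF))
		atomic_fetch_add(&req->inode->nrequests, 1);
}

/* Pre-patch shape: the decrement does not depend on the flag. */
static void model_remove_old(struct model_req *req)
{
	atomic_fetch_sub(&req->inode->nrequests, 1);
	atomic_fetch_and(&req->wb_flags, ~MODEL_PG_INODE_REF);
}

/* Post-patch shape: decrement only if the flag was set and is now cleared. */
static void model_remove_new(struct model_req *req)
{
	if (atomic_fetch_and(&req->wb_flags, ~MODEL_PG_INODE_REF) &
	    MODEL_PG_INODE_REF)
		atomic_fetch_sub(&req->inode->nrequests, 1);
}

/* Add one request, run the removal path twice, return the final count. */
static long model_run(bool patched)
{
	struct model_inode inode;
	struct model_req req;

	atomic_init(&inode.nrequests, 0);
	atomic_init(&req.wb_flags, 0);
	req.inode = &inode;

	model_add_request(&req);
	assert(atomic_load(&inode.nrequests) == 1);

	for (int i = 0; i < 2; i++) {
		if (patched)
			model_remove_new(&req);
		else
			model_remove_old(&req);
	}
	return atomic_load(&inode.nrequests);
}
```

In this model the pre-patch variant ends with a non-zero counter, while the post-patch variant returns to zero because the decrement is tied to the same flag transition that guarded the increment.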