Message ID | 1452047231-30569-1-git-send-email-tariq.x.saeed@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Tariq, On 01/06/2016 10:27 AM, Tariq Saeed wrote: > Orabug: 21793017 > > commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") > introduced this issue. ocfs2_setattr called by chmod command > holds cluster wide inode lock (Orabug 21685187) when calling > posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl > and ocfs2_iop_set_acl. These two are also called directly from vfs > layer for getfacl/setfacl commands and therefore acquire the cluster wide > inode lock. If a remote conversion request comes after the first inode > lock in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set in l_flags. This > will cause the second call to inode lock from the ocfs2_iop_get_acl() > to block indefinetly. This fixed part of the deadlock. There is another part. ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() Thanks, Junxiao. > > To solve this, we need to use nolock version of getacl. Since > nolock version of posix_acl_chmod does not exist, we restore a slightly > modified version of ocfs2_acl_chmod, which was removed in > commit 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure") > that uses nolock version of getacl. > > Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> > --- > fs/ocfs2/acl.c | 25 +++++++++++++++++++++++++ > fs/ocfs2/acl.h | 1 + > fs/ocfs2/file.c | 4 ++-- > 3 files changed, 28 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c > index 0cdf497..0fbd18c 100644 > --- a/fs/ocfs2/acl.c > +++ b/fs/ocfs2/acl.c > @@ -322,3 +322,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type) > brelse(di_bh); > return acl; > } > + > +int ocfs2_acl_chmod(struct inode *inode, struct buffer_head *bh) > +{ > + struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); > + struct posix_acl *acl; > + int ret; > + > + if (S_ISLNK(inode->i_mode)) > + return -EOPNOTSUPP; > + > + if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL)) > + return 0; > + > + acl = ocfs2_get_acl_nolock(inode, ACL_TYPE_ACCESS, bh); > + if (IS_ERR(acl) || !acl) > + return PTR_ERR(acl); > + ret = __posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode); > + if (ret) > + return ret; > + ret = ocfs2_set_acl(NULL, inode, NULL, ACL_TYPE_ACCESS, > + acl, NULL, NULL); > + posix_acl_release(acl); > + return ret; > +} > + > diff --git a/fs/ocfs2/acl.h b/fs/ocfs2/acl.h > index 3fce68d..035e587 100644 > --- a/fs/ocfs2/acl.h > +++ b/fs/ocfs2/acl.h > @@ -35,5 +35,6 @@ int ocfs2_set_acl(handle_t *handle, > struct posix_acl *acl, > struct ocfs2_alloc_context *meta_ac, > struct ocfs2_alloc_context *data_ac); > +extern int ocfs2_acl_chmod(struct inode *, struct buffer_head *); > > #endif /* OCFS2_ACL_H */ > diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c > index 0e5b451..77d30cb 100644 > --- a/fs/ocfs2/file.c > +++ b/fs/ocfs2/file.c > @@ -1268,20 +1268,20 @@ bail_unlock_rw: > if (size_change) > ocfs2_rw_unlock(inode, 1); > bail: > - brelse(bh); > > /* Release quota pointers in case we acquired them */ > for (qtype = 0; qtype < OCFS2_MAXQUOTAS; qtype++) > dqput(transfer_to[qtype]); > > if (!status && attr->ia_valid & ATTR_MODE) { > - status = posix_acl_chmod(inode, inode->i_mode); > + status = ocfs2_acl_chmod(inode, bh); > if (status < 0) > mlog_errno(status); > } > if (inode_locked) > ocfs2_inode_unlock(inode, 1); > > + brelse(bh); > return status; > } > >
On 01/05/2016 08:41 PM, Junxiao Bi wrote: > This fixed part of the deadlock. There is another part. > ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() Right. It used to be: ocfs2_mknod -> ocfs2_init_acl -> ocfs2_get_acl_nolock. Thanks -Tariq
On 01/06/2016 06:51 PM, Tariq Saeed wrote: > On 01/05/2016 08:41 PM, Junxiao Bi wrote: >> This fixed part of the deadlock. There is another part. >> ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() > yet another part. ocfs2_reflink -> ocfs2_init_security_and_acl -> ocfs2_iop_set_acl
On 01/06/2016 06:51 PM, Tariq Saeed wrote: > On 01/05/2016 08:41 PM, Junxiao Bi wrote: >> This fixed part of the deadlock. There is another part. >> ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() > Right. It used to be: ocfs2_mknod -> ocfs2_init_acl -> > ocfs2_get_acl_nolock. > > Thanks > -Tariq > > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel@oss.oracle.com > https://oss.oracle.com/mailman/listinfo/ocfs2-devel
On Tue, Jan 05, 2016 at 06:27:11PM -0800, Tariq Saeed wrote: > Orabug: 21793017 > > commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") > introduced this issue. ocfs2_setattr called by chmod command > holds cluster wide inode lock (Orabug 21685187) when calling > posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl > and ocfs2_iop_set_acl. These two are also called directly from vfs > layer for getfacl/setfacl commands and therefore acquire the cluster wide > inode lock. If a remote conversion request comes after the first inode > lock in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set in l_flags. This > will cause the second call to inode lock from the ocfs2_iop_get_acl() > to block indefinetly. > > To solve this, we need to use nolock version of getacl. Since > nolock version of posix_acl_chmod does not exist, we restore a slightly > modified version of ocfs2_acl_chmod, which was removed in > commit 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure") > that uses nolock version of getacl. This looks good, thanks Tariq. One comment about your description - you actually used the correct version of posix_acl_chmod() -- specifically __posix_acl_chmod(). It's not so much that there isn't a 'nolock' version of posix_acl_chmod(). It's more that we were using the higher level version which makes calls back into the fs. For most filesystems this is ok, so we save a lot of code by having the function do this. For filesystems like ocfs2 which have additional locking or other complexity it does not work, hence we directly call the function that does the non-vfs work (__posix_acl_chmod()) and replicate the small checks at the top fo the vfs function. So you could replace that last paragraph with something like this: The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which does not call back into the filesystem. Therefore, we restore ocfs2_acl_chmod() and use that instead. Reviewed-by: Mark Fasheh <mfasheh@suse.de> --Mark -- Mark Fasheh
On 01/07/2016 02:55 PM, Mark Fasheh wrote: > So you could replace that last paragraph with something like this: > > The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which > does not call back into the filesystem. Therefore, we restore > ocfs2_acl_chmod() and use that instead. Thanks for reviewing. I have two more code paths to fix. 1. ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() 2. ocfs2_reflink -> ocfs2_init_security_and_acl -> ocfs2_iop_set_acl I will make your suggested change and submit again a 3-part patch, including the above two. -Tariq
On 01/07/2016 02:37 PM, Tariq Saeed wrote: > et another part. ocfs2_reflink -> ocfs2_init_security_and_acl -> > ocfs2_iop_set_acl On a closer look at the code, this is not a problem. Thanks, -Tariq
On Thu, Jan 07, 2016 at 03:49:27PM -0800, Tariq Saeed wrote: > > On 01/07/2016 02:55 PM, Mark Fasheh wrote: > >So you could replace that last paragraph with something like this: > > > >The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which > >does not call back into the filesystem. Therefore, we restore > >ocfs2_acl_chmod() and use that instead. > Thanks for reviewing. I have two more code paths to fix. No problem, thanks for the continuing patches. > 1. ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() Ok, that seems straightforward enough. > 2. ocfs2_reflink -> ocfs2_init_security_and_acl -> ocfs2_iop_set_acl Could you elaborate for me on the problem you found there? Btw, ocfs2_iop_set_acl() isn't doing any cluster locking. That doesn't look right to me but maybe I'm missing something (like perhaps it gets called from lock context). I'll try to take a look tommorrow but since you've been looking around this area I thought I'd mention this to you. Thanks, --Mark -- Mark Fasheh
On 01/07/2016 05:45 PM, Mark Fasheh wrote: >> 2. ocfs2_reflink -> ocfs2_init_security_and_acl -> ocfs2_iop_set_acl > Could you elaborate for me on the problem you found there? The problem here is dropping dir lock in posix_acl_create after getting default_acl and acl. These two can be changed by another node by the time we get around to using them in ocfs2_init_security_and_acl. The old code in uek3 is mindful of this. In that code, ocfs2_init_security_and_acl gets the dir lock and keeps it until the acls of new node are set. Thanks -Tariq
On 01/08/2016 07:49 AM, Tariq Saeed wrote: > > On 01/07/2016 02:55 PM, Mark Fasheh wrote: >> So you could replace that last paragraph with something like this: >> >> The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which >> does not call back into the filesystem. Therefore, we restore >> ocfs2_acl_chmod() and use that instead. > Thanks for reviewing. I have two more code paths to fix. > > 1. ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() > 2. ocfs2_reflink -> ocfs2_init_security_and_acl -> ocfs2_iop_set_acl Caught one more. ocfs2_orphan_filldir()->ocfs2_iget()->ocfs2_read_locked_inode(), inode open lock was locked. Later ocfs2_query_inode_wipe() locked the same open lock again! See detailed call trace(change my original recursive locking support patch a little to catch recursive locking): [240869.116872] (kworker/u30:1,1436,0):__ocfs2_cluster_lock:1563 ERROR: recursive locking rejected! [240869.122725] CPU: 0 PID: 1436 Comm: kworker/u30:1 Tainted: G O 4.4.0-rc8-next-20160105 #1 [240869.137262] Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 [240869.148014] Workqueue: ocfs2_wq ocfs2_complete_recovery [ocfs2] [240869.159213] 0000000000000000 ffff88004b94f568 ffffffff81346cc4 0000000000000000 [240869.173066] 1000000000000800 ffff88003afe1b60 ffff88001b1bbc98 ffff88004b94f6d8 [240869.191668] ffffffffa00f78bc ffff88004b94f588 ffffffffa0104bed ffff88004b94f598 [240869.207480] Call Trace: [240869.212521] [<ffffffff81346cc4>] dump_stack+0x48/0x64 [240869.222182] [<ffffffffa00f78bc>] __ocfs2_cluster_lock+0x4ac/0x970 [ocfs2] [240869.235512] [<ffffffffa0104bed>] ? ocfs2_inode_cache_io_unlock+0xd/0x10 [ocfs2] [240869.249301] [<ffffffffa0143b24>] ? ocfs2_metadata_cache_io_unlock+0x14/0x30 [ocfs2] [240869.265103] [<ffffffffa00ebaae>] ? ocfs2_read_blocks+0x3be/0x5e0 [ocfs2] [240869.277703] [<ffffffff810a158e>] ? __wake_up+0x4e/0x70 [240869.287446] [<ffffffffa00f9c5e>] ocfs2_try_open_lock+0xfe/0x110 [ocfs2] [240869.300877] [<ffffffffa0106737>] ? ocfs2_query_inode_wipe+0xe7/0x250 [ocfs2] [240869.316895] [<ffffffffa0106737>] ocfs2_query_inode_wipe+0xe7/0x250 [ocfs2] [240869.329955] [<ffffffffa0107d7f>] ? ocfs2_evict_inode+0x13f/0x350 [ocfs2] [240869.342520] [<ffffffffa0107da5>] ocfs2_evict_inode+0x165/0x350 [ocfs2] [240869.354518] [<ffffffff810a11a0>] ? wake_atomic_t_function+0x30/0x30 [240869.366326] [<ffffffff811af043>] evict+0xd3/0x1c0 [240869.373057] [<ffffffffa0106f8f>] ? ocfs2_drop_inode+0x6f/0x100 [ocfs2] [240869.382059] [<ffffffff81346c50>] ? _atomic_dec_and_lock+0x50/0x70 [240869.390458] [<ffffffff811af39d>] iput+0x19d/0x210 [240869.397095] [<ffffffffa0109c2a>] ocfs2_orphan_filldir+0x9a/0x140 [ocfs2] [240869.409187] [<ffffffffa00f1866>] ocfs2_dir_foreach_blk+0x1b6/0x4d0 [ocfs2] [240869.423581] [<ffffffffa00f1ba4>] ocfs2_dir_foreach+0x24/0x30 [ocfs2] [240869.436807] [<ffffffffa0109a70>] ocfs2_queue_orphans+0x90/0x1b0 [ocfs2] [240869.450569] [<ffffffffa0109b90>] ? ocfs2_queue_orphans+0x1b0/0x1b0 [ocfs2] [240869.467103] [<ffffffffa010c099>] ocfs2_recover_orphans+0x49/0x420 [ocfs2] [240869.485250] [<ffffffff8108131f>] ? __queue_delayed_work+0x8f/0x190 [240869.497945] [<ffffffffa010c674>] ocfs2_complete_recovery+0x204/0x440 [ocfs2] [240869.512787] [<ffffffff8108105b>] ? pwq_dec_nr_in_flight+0x4b/0xa0 [240869.525181] [<ffffffff810819d8>] process_one_work+0x168/0x4c0 [240869.534732] [<ffffffff810c2244>] ? internal_add_timer+0x34/0x90 [240869.545003] [<ffffffff810c40d6>] ? mod_timer+0xf6/0x1b0 [240869.555854] [<ffffffff8191e6ab>] ? schedule+0x3b/0xa0 [240869.566496] [<ffffffff8191e671>] ? schedule+0x1/0xa0 [240869.576832] [<ffffffff810827b9>] worker_thread+0x159/0x6d0 [240869.588197] [<ffffffff8109b7c1>] ? put_prev_task_fair+0x21/0x40 [240869.600286] [<ffffffff8191df6f>] ? __schedule+0x38f/0x980 [240869.611697] [<ffffffff8108f41d>] ? default_wake_function+0xd/0x10 [240869.622461] [<ffffffff810a1031>] ? __wake_up_common+0x51/0x80 [240869.629365] [<ffffffff81082660>] ? create_worker+0x190/0x190 [240869.635999] [<ffffffff8191e6ab>] ? schedule+0x3b/0xa0 [240869.641966] [<ffffffff81082660>] ? create_worker+0x190/0x190 [240869.648574] [<ffffffff81086f37>] kthread+0xc7/0xe0 [240869.653436] [<ffffffff8108e8f9>] ? schedule_tail+0x19/0xb0 [240869.661235] [<ffffffff81086e70>] ? kthread_freezable_should_stop+0x70/0x70 [240869.675121] [<ffffffff8192224f>] ret_from_fork+0x3f/0x70 [240869.685850] [<ffffffff81086e70>] ? kthread_freezable_should_stop+0x70/0x70 [240869.700005] (kworker/u30:1,1436,0):ocfs2_query_inode_wipe:984 ERROR: status = -1 [240869.714660] (kworker/u30:1,1436,0):ocfs2_delete_inode:1085 ERROR: status = -1 Thanks, Junxiao. > > I will make your suggested change and submit again > a 3-part patch, including the above two. > -Tariq > > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel@oss.oracle.com > https://oss.oracle.com/mailman/listinfo/ocfs2-devel >
Hi Mark, On 01/11/2016 11:17 AM, Junxiao Bi wrote: > On 01/08/2016 07:49 AM, Tariq Saeed wrote: >> >> On 01/07/2016 02:55 PM, Mark Fasheh wrote: >>> So you could replace that last paragraph with something like this: >>> >>> The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which >>> does not call back into the filesystem. Therefore, we restore >>> ocfs2_acl_chmod() and use that instead. >> Thanks for reviewing. I have two more code paths to fix. >> >> 1. ocfs2_mknod()->posix_acl_create()->ocfs2_iop_get_acl() >> 2. ocfs2_reflink -> ocfs2_init_security_and_acl -> ocfs2_iop_set_acl > Caught one more. > > ocfs2_orphan_filldir()->ocfs2_iget()->ocfs2_read_locked_inode(), inode > open lock was locked. Later ocfs2_query_inode_wipe() locked the same > open lock again! What's your idea about fixing the above recursive locking? It is recursive locking, but no deadlock since second one is using try locking. Thanks, Junxiao. > > See detailed call trace(change my original recursive locking support > patch a little to catch recursive locking): > > [240869.116872] (kworker/u30:1,1436,0):__ocfs2_cluster_lock:1563 ERROR: > recursive locking rejected! > [240869.122725] CPU: 0 PID: 1436 Comm: kworker/u30:1 Tainted: G > O 4.4.0-rc8-next-20160105 #1 > [240869.137262] Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 > [240869.148014] Workqueue: ocfs2_wq ocfs2_complete_recovery [ocfs2] > [240869.159213] 0000000000000000 ffff88004b94f568 ffffffff81346cc4 > 0000000000000000 > [240869.173066] 1000000000000800 ffff88003afe1b60 ffff88001b1bbc98 > ffff88004b94f6d8 > [240869.191668] ffffffffa00f78bc ffff88004b94f588 ffffffffa0104bed > ffff88004b94f598 > [240869.207480] Call Trace: > [240869.212521] [<ffffffff81346cc4>] dump_stack+0x48/0x64 > [240869.222182] [<ffffffffa00f78bc>] __ocfs2_cluster_lock+0x4ac/0x970 > [ocfs2] > [240869.235512] [<ffffffffa0104bed>] ? > ocfs2_inode_cache_io_unlock+0xd/0x10 [ocfs2] > [240869.249301] [<ffffffffa0143b24>] ? > ocfs2_metadata_cache_io_unlock+0x14/0x30 [ocfs2] > [240869.265103] [<ffffffffa00ebaae>] ? ocfs2_read_blocks+0x3be/0x5e0 > [ocfs2] > [240869.277703] [<ffffffff810a158e>] ? __wake_up+0x4e/0x70 > [240869.287446] [<ffffffffa00f9c5e>] ocfs2_try_open_lock+0xfe/0x110 [ocfs2] > [240869.300877] [<ffffffffa0106737>] ? > ocfs2_query_inode_wipe+0xe7/0x250 [ocfs2] > [240869.316895] [<ffffffffa0106737>] ocfs2_query_inode_wipe+0xe7/0x250 > [ocfs2] > [240869.329955] [<ffffffffa0107d7f>] ? ocfs2_evict_inode+0x13f/0x350 > [ocfs2] > [240869.342520] [<ffffffffa0107da5>] ocfs2_evict_inode+0x165/0x350 [ocfs2] > [240869.354518] [<ffffffff810a11a0>] ? wake_atomic_t_function+0x30/0x30 > [240869.366326] [<ffffffff811af043>] evict+0xd3/0x1c0 > [240869.373057] [<ffffffffa0106f8f>] ? ocfs2_drop_inode+0x6f/0x100 [ocfs2] > [240869.382059] [<ffffffff81346c50>] ? _atomic_dec_and_lock+0x50/0x70 > [240869.390458] [<ffffffff811af39d>] iput+0x19d/0x210 > [240869.397095] [<ffffffffa0109c2a>] ocfs2_orphan_filldir+0x9a/0x140 > [ocfs2] > [240869.409187] [<ffffffffa00f1866>] ocfs2_dir_foreach_blk+0x1b6/0x4d0 > [ocfs2] > [240869.423581] [<ffffffffa00f1ba4>] ocfs2_dir_foreach+0x24/0x30 [ocfs2] > [240869.436807] [<ffffffffa0109a70>] ocfs2_queue_orphans+0x90/0x1b0 [ocfs2] > [240869.450569] [<ffffffffa0109b90>] ? ocfs2_queue_orphans+0x1b0/0x1b0 > [ocfs2] > [240869.467103] [<ffffffffa010c099>] ocfs2_recover_orphans+0x49/0x420 > [ocfs2] > [240869.485250] [<ffffffff8108131f>] ? __queue_delayed_work+0x8f/0x190 > [240869.497945] [<ffffffffa010c674>] > ocfs2_complete_recovery+0x204/0x440 [ocfs2] > [240869.512787] [<ffffffff8108105b>] ? pwq_dec_nr_in_flight+0x4b/0xa0 > [240869.525181] [<ffffffff810819d8>] process_one_work+0x168/0x4c0 > [240869.534732] [<ffffffff810c2244>] ? internal_add_timer+0x34/0x90 > [240869.545003] [<ffffffff810c40d6>] ? mod_timer+0xf6/0x1b0 > [240869.555854] [<ffffffff8191e6ab>] ? schedule+0x3b/0xa0 > [240869.566496] [<ffffffff8191e671>] ? schedule+0x1/0xa0 > [240869.576832] [<ffffffff810827b9>] worker_thread+0x159/0x6d0 > [240869.588197] [<ffffffff8109b7c1>] ? put_prev_task_fair+0x21/0x40 > [240869.600286] [<ffffffff8191df6f>] ? __schedule+0x38f/0x980 > [240869.611697] [<ffffffff8108f41d>] ? default_wake_function+0xd/0x10 > [240869.622461] [<ffffffff810a1031>] ? __wake_up_common+0x51/0x80 > [240869.629365] [<ffffffff81082660>] ? create_worker+0x190/0x190 > [240869.635999] [<ffffffff8191e6ab>] ? schedule+0x3b/0xa0 > [240869.641966] [<ffffffff81082660>] ? create_worker+0x190/0x190 > [240869.648574] [<ffffffff81086f37>] kthread+0xc7/0xe0 > [240869.653436] [<ffffffff8108e8f9>] ? schedule_tail+0x19/0xb0 > [240869.661235] [<ffffffff81086e70>] ? > kthread_freezable_should_stop+0x70/0x70 > [240869.675121] [<ffffffff8192224f>] ret_from_fork+0x3f/0x70 > [240869.685850] [<ffffffff81086e70>] ? > kthread_freezable_should_stop+0x70/0x70 > [240869.700005] (kworker/u30:1,1436,0):ocfs2_query_inode_wipe:984 ERROR: > status = -1 > [240869.714660] (kworker/u30:1,1436,0):ocfs2_delete_inode:1085 ERROR: > status = -1 > > Thanks, > Junxiao. >> >> I will make your suggested change and submit again >> a 3-part patch, including the above two. >> -Tariq >> >> >> _______________________________________________ >> Ocfs2-devel mailing list >> Ocfs2-devel@oss.oracle.com >> https://oss.oracle.com/mailman/listinfo/ocfs2-devel >> >
diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c index 0cdf497..0fbd18c 100644 --- a/fs/ocfs2/acl.c +++ b/fs/ocfs2/acl.c @@ -322,3 +322,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type) brelse(di_bh); return acl; } + +int ocfs2_acl_chmod(struct inode *inode, struct buffer_head *bh) +{ + struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); + struct posix_acl *acl; + int ret; + + if (S_ISLNK(inode->i_mode)) + return -EOPNOTSUPP; + + if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL)) + return 0; + + acl = ocfs2_get_acl_nolock(inode, ACL_TYPE_ACCESS, bh); + if (IS_ERR(acl) || !acl) + return PTR_ERR(acl); + ret = __posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode); + if (ret) + return ret; + ret = ocfs2_set_acl(NULL, inode, NULL, ACL_TYPE_ACCESS, + acl, NULL, NULL); + posix_acl_release(acl); + return ret; +} + diff --git a/fs/ocfs2/acl.h b/fs/ocfs2/acl.h index 3fce68d..035e587 100644 --- a/fs/ocfs2/acl.h +++ b/fs/ocfs2/acl.h @@ -35,5 +35,6 @@ int ocfs2_set_acl(handle_t *handle, struct posix_acl *acl, struct ocfs2_alloc_context *meta_ac, struct ocfs2_alloc_context *data_ac); +extern int ocfs2_acl_chmod(struct inode *, struct buffer_head *); #endif /* OCFS2_ACL_H */ diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 0e5b451..77d30cb 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1268,20 +1268,20 @@ bail_unlock_rw: if (size_change) ocfs2_rw_unlock(inode, 1); bail: - brelse(bh); /* Release quota pointers in case we acquired them */ for (qtype = 0; qtype < OCFS2_MAXQUOTAS; qtype++) dqput(transfer_to[qtype]); if (!status && attr->ia_valid & ATTR_MODE) { - status = posix_acl_chmod(inode, inode->i_mode); + status = ocfs2_acl_chmod(inode, bh); if (status < 0) mlog_errno(status); } if (inode_locked) ocfs2_inode_unlock(inode, 1); + brelse(bh); return status; }
Orabug: 21793017 commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") introduced this issue. ocfs2_setattr called by chmod command holds cluster wide inode lock (Orabug 21685187) when calling posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl and ocfs2_iop_set_acl. These two are also called directly from vfs layer for getfacl/setfacl commands and therefore acquire the cluster wide inode lock. If a remote conversion request comes after the first inode lock in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set in l_flags. This will cause the second call to inode lock from the ocfs2_iop_get_acl() to block indefinetly. To solve this, we need to use nolock version of getacl. Since nolock version of posix_acl_chmod does not exist, we restore a slightly modified version of ocfs2_acl_chmod, which was removed in commit 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure") that uses nolock version of getacl. Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> --- fs/ocfs2/acl.c | 25 +++++++++++++++++++++++++ fs/ocfs2/acl.h | 1 + fs/ocfs2/file.c | 4 ++-- 3 files changed, 28 insertions(+), 2 deletions(-)