diff mbox

[17/19] VFS: set PF_FSTRANS while namespace_sem is held.

Message ID 20140416155230.4d02e4b9@notabene.brown (mailing list archive)
State New, archived
Headers show

Commit Message

NeilBrown April 16, 2014, 5:52 a.m. UTC
On Wed, 16 Apr 2014 05:46:18 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Wed, Apr 16, 2014 at 02:03:37PM +1000, NeilBrown wrote:
> > namespace_sem can be taken while various i_mutex locks are held, so we
> > need to avoid reclaim from blocking on an FS (particularly loop-back
> > NFS).
> 
> I would really prefer to deal with that differently - by explicit change of
> gfp_t arguments of allocators.
> 
> The thing is, namespace_sem is held *only* over allocations, and not a lot
> of them, at that - only mnt_alloc_id(), mnt_alloc_group_id(), alloc_vfsmnt()
> and new_mountpoint().  That is all that is allowed.
> 
> Again, actual work with filesystems (setup, shutdown, remount, pathname
> resolution, etc.) is all done outside of namespace_sem; it's held only
> for manipulations of fs/{namespace,pnode}.c data structures and the only
> reason it isn't a spinlock is that we need to do some allocations.
> 
> So I'd rather slap GFP_NOFS on those few allocations...

So something like this?  I put that in to my testing instead.

Thanks,
NeilBrown

Comments

Al Viro April 16, 2014, 4:37 p.m. UTC | #1
On Wed, Apr 16, 2014 at 03:52:30PM +1000, NeilBrown wrote:

> So something like this?  I put that in to my testing instead.

Something like this, yes...  And TBH I would prefer the same approach
elsewhere - this kind of hidden state makes it hard to do any kind of
analysis.

Put it that way - the simplest situation is when the allocation flags
depend only on the call site.  Next one is when it's a function of
call chain - somewhat harder to review.  And the worst is when it's
a function of previous history of execution - not just the call chain,
but the things that had been called (and returned) prior to that one.

How many of those locations need to be of the third kind?  All fs/namespace.c
ones are of the first one...
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index 83dcd5083dbb..8e103b8c8323 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -103,7 +103,7 @@  static int mnt_alloc_id(struct mount *mnt)
 	int res;
 
 retry:
-	ida_pre_get(&mnt_id_ida, GFP_KERNEL);
+	ida_pre_get(&mnt_id_ida, GFP_NOFS);
 	spin_lock(&mnt_id_lock);
 	res = ida_get_new_above(&mnt_id_ida, mnt_id_start, &mnt->mnt_id);
 	if (!res)
@@ -134,7 +134,7 @@  static int mnt_alloc_group_id(struct mount *mnt)
 {
 	int res;
 
-	if (!ida_pre_get(&mnt_group_ida, GFP_KERNEL))
+	if (!ida_pre_get(&mnt_group_ida, GFP_NOFS))
 		return -ENOMEM;
 
 	res = ida_get_new_above(&mnt_group_ida,
@@ -193,7 +193,7 @@  unsigned int mnt_get_count(struct mount *mnt)
 
 static struct mount *alloc_vfsmnt(const char *name)
 {
-	struct mount *mnt = kmem_cache_zalloc(mnt_cache, GFP_KERNEL);
+	struct mount *mnt = kmem_cache_zalloc(mnt_cache, GFP_NOFS);
 	if (mnt) {
 		int err;
 
@@ -202,7 +202,7 @@  static struct mount *alloc_vfsmnt(const char *name)
 			goto out_free_cache;
 
 		if (name) {
-			mnt->mnt_devname = kstrdup(name, GFP_KERNEL);
+			mnt->mnt_devname = kstrdup(name, GFP_NOFS);
 			if (!mnt->mnt_devname)
 				goto out_free_id;
 		}
@@ -682,7 +682,7 @@  static struct mountpoint *new_mountpoint(struct dentry *dentry)
 		}
 	}
 
-	mp = kmalloc(sizeof(struct mountpoint), GFP_KERNEL);
+	mp = kmalloc(sizeof(struct mountpoint), GFP_NOFS);
 	if (!mp)
 		return ERR_PTR(-ENOMEM);