Inconsistency when mounting a directory that 'world' cannot access.

Message ID 20121003134629.72557522@notabene.brown (mailing list archive)
State New, archived

Commit Message

NeilBrown Oct. 3, 2012, 3:46 a.m. UTC
On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:

> I guess you're right.  So it starts to sound more like: "you have a
> confusing setup.  Your export configuration says one thing, and your
> filesystem permissions say another.  Under NFSv3 the confusion didn't
> matter, but now it does--time to fix it."
> 

That's the best I could come to - I'm glad to have it confirmed.  Thanks!

It is unfortunate that Linux NFS uses an anon credential to mount when krb5
is in use, and uses 'root' when auth_sys is used (which might be anon if
"root_squash" is active, but might not).
I wonder if it would work to use auth_none for the mount-time lookup, just
for consistency..

Is the following appropriate?  Is there somewhere better to put this caveat?

Thanks,
NeilBrown

Comments

J. Bruce Fields Oct. 3, 2012, 3:13 p.m. UTC | #1
On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> 
> > I guess you're right.  So it starts to sound more like: "you have a
> > confusing setup.  Your export configuration says one thing, and your
> > filesystem permissions say another.  Under NFSv3 the confusion didn't
> > matter, but now it does--time to fix it."
> > 
> 
> That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> 
> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> is in use, and uses 'root' when auth_sys is used (which might be anon if
> "root_squash" is active, but might not).
> I wonder if it would work to use auth_none for the mount-time lookup, just
> for consistency..
> 
> Is the following appropriate?  Is there somewhere better to put this caveat?

Unfortunately, it's more complicated than this, as it depends on client
implementation and configuration details.

Something like this would be more accurate but possibly too long:

	Note that under NFSv2 and NFSv3, the mount path is traversed by
	mountd acting as root, but under NFSv4 the mount path is looked
	up using the client's credentials.  This means that, for
	example, if a client mounts using a krb5 credential that the
	server maps to an "anonymous" user, then the mount will only
	succeed if that directory and all its parents allow eXecute
	permissions.
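
A rough way to check a server-side path against the rule described above. This is a minimal sketch, assuming GNU stat and a shell on the server; the function name check_world_search is made up for illustration. It only inspects the "other" permission bits, so it ignores ACLs and the case where the anonymous user happens to own a directory or belong to its group.

```shell
#!/bin/sh
# Print every directory from TARGET up to / that does not grant
# execute (search) permission to "other".  Prints nothing and
# returns 0 when every component is searchable by "other".
check_world_search() {
    # Resolve to a physical absolute path.  This needs permission to
    # enter the directory, so run it as root on the server.
    dir=$(cd "$1" 2>/dev/null && pwd -P) || {
        echo "cannot enter: $1" >&2
        return 2
    }
    blocked=0
    while :; do
        mode=$(stat -c %a "$dir") || return 2   # GNU stat: octal mode
        case "$mode" in
            *[1357]) ;;                          # last digit odd: o+x set
            *) echo "$dir (mode $mode)"; blocked=1 ;;
        esac
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
    return "$blocked"
}

# Hypothetical usage on the server:
#   check_world_search /export/staff/alice
```

Any directory it prints is one where a lookup by a client mapped to an unprivileged anonymous user would be expected to fail under NFSv4.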

--b.

> 
> Thanks,
> NeilBrown
> 
> 
> diff --git a/utils/exportfs/exports.man b/utils/exportfs/exports.man
> index bc1de73..91e4b9c 100644
> --- a/utils/exportfs/exports.man
> +++ b/utils/exportfs/exports.man
> @@ -126,6 +126,10 @@ will be enforced only for access using flavors listed in the immediately
>  preceding sec= option.  The only options that are permitted to vary in
>  this way are ro, rw, no_root_squash, root_squash, and all_squash.
>  .PP
> +When RPCSEC_GSS is used with NFSv4, a client will only be able to mount a
> +directory if that directory and all its ancestors give eXecute access
> +to "world".
> +.PP
>  .SS General Options
>  .BR exportfs
>  understands the following export options:


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Trond Myklebust Oct. 3, 2012, 3:48 p.m. UTC | #2
On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> > 
> > > I guess you're right.  So it starts to sound more like: "you have a
> > > confusing setup.  Your export configuration says one thing, and your
> > > filesystem permissions say another.  Under NFSv3 the confusion didn't
> > > matter, but now it does--time to fix it."
> > > 
> > 
> > That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> > 
> > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > "root_squash" is active, but might not).
> > I wonder if it would work to use auth_none for the mount-time lookup, just
> > for consistency..
> > 
> > Is the following appropriate?  Is there somewhere better to put this caveat?
> 
> Unfortunately, it's more complicated than this, as it depends on client
> implementation and configuration details.
> 
> Something like this would be more accurate but possibly too long:
> 
> 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> 	mountd acting as root, but under NFSv4 the mount path is looked
> 	up using the client's credentials.  This means that, for
> 	example, if a client mounts using a krb5 credential that the
> 	server maps to an "anonmyous" user, then the mount will only
> 	succeed if that directory and all its parents allow eXecute
> 	permissions.

So you're listing this as a "feature" rather than a bug? There should be
no reason to constrain the pseudofs to use the permission checks from
the underlying filesystem.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
J. Bruce Fields Oct. 3, 2012, 4:27 p.m. UTC | #3
On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > > wrote:
> > > 
> > > > I guess you're right.  So it starts to sound more like: "you have a
> > > > confusing setup.  Your export configuration says one thing, and your
> > > > filesystem permissions say another.  Under NFSv3 the confusion didn't
> > > > matter, but now it does--time to fix it."
> > > > 
> > > 
> > > That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> > > 
> > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > "root_squash" is active, but might not).
> > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > for consistency..
> > > 
> > > Is the following appropriate?  Is there somewhere better to put this caveat?
> > 
> > Unfortunately, it's more complicated than this, as it depends on client
> > implementation and configuration details.
> > 
> > Something like this would be more accurate but possibly too long:
> > 
> > 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> > 	mountd acting as root, but under NFSv4 the mount path is looked
> > 	up using the client's credentials.  This means that, for
> > 	example, if a client mounts using a krb5 credential that the
> > 	server maps to an "anonmyous" user, then the mount will only
> > 	succeed if that directory and all its parents allow eXecute
> > 	permissions.
> 
> So you're listing this as a "feature" rather than a bug? There should be
> no reason to constrain the pseudofs to use the permission checks from
> the underlying filesystem.

I'd be fine with that.

(That still leaves some subtle v3/v4 difference in the case of mount
paths underneath an export?

What *is* the existing mountd behavior there, exactly?  I'm inclined to
think allowing mounts of arbitrary subdirectories is a bug, but maybe
there's some historical reason for it or maybe someone already depends
on it.)

--b.
NeilBrown Oct. 3, 2012, 10:46 p.m. UTC | #4
On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:

> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> > On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > > > wrote:
> > > > 
> > > > > I guess you're right.  So it starts to sound more like: "you have a
> > > > > confusing setup.  Your export configuration says one thing, and your
> > > > > filesystem permissions say another.  Under NFSv3 the confusion didn't
> > > > > matter, but now it does--time to fix it."
> > > > > 
> > > > 
> > > > That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> > > > 
> > > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > > "root_squash" is active, but might not).
> > > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > > for consistency..
> > > > 
> > > > Is the following appropriate?  Is there somewhere better to put this caveat?
> > > 
> > > Unfortunately, it's more complicated than this, as it depends on client
> > > implementation and configuration details.
> > > 
> > > Something like this would be more accurate but possibly too long:
> > > 
> > > 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> > > 	mountd acting as root, but under NFSv4 the mount path is looked
> > > 	up using the client's credentials.  This means that, for
> > > 	example, if a client mounts using a krb5 credential that the
> > > 	server maps to an "anonmyous" user, then the mount will only
> > > 	succeed if that directory and all its parents allow eXecute
> > > 	permissions.
> > 
> > So you're listing this as a "feature" rather than a bug? There should be
> > no reason to constrain the pseudofs to use the permission checks from
> > the underlying filesystem.
> 
> I'd be fine with that.
> 
> (That still leaves some subtle v3/v4 difference in the case of mount
> paths underneath an export?
> 
> What *is* the existing mountd behavior there, exactly?  I'm inclined to
> think allowing mounts of arbitrary subdirectories is a bug, but maybe
> there's some historical reason for it or maybe someone already depends
> on it.)
> 
> --b.

The behaviour is simply that you mount a filehandle (typically belonging to a
directory) and that filehandle can be anything inside any exported filesystem.
Yes, people do depend on being able to mount filehandles that aren't the root
of a filesystem.

The case that brought this issue to my attention involved the server having
a directory containing hundreds of home directories.  This directory is
exported.

If they mount that top level directory they get horrible performance.  If
they use an automounter to just mount the homes that are accessed it works
better.  They weren't able to explain why but my guess is that some tools
(GUI filesystem browser) would occasionally do the equivalent of "ls  -l" of
the top level directory which would hammer nfs-idmapd and probably ldap....
though you would think that would get cached and not be a problem for long.
So maybe it is more subtle than that.

I've built similar setups before.  There is something attractive about
everyone's home directory being /home/$USERNAME even though they are on
different servers and different filesystems.
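
For illustration, the per-user mounts described above could be driven by an automounter map with one entry per home directory. This is a minimal sketch: the helper name automount_entry, the host fs1.example.com and the path /export/staff are all hypothetical, and the output follows the usual autofs "key -options location" map layout.

```shell
#!/bin/sh
# Emit one autofs map line: key, mount options, then server:path.
# All arguments are supplied by the caller; nothing here is specific
# to the setup discussed in this thread.
automount_entry() {
    user=$1      # map key, becomes /home/<user> on the client
    server=$2    # NFS server holding this user's home
    export=$3    # exported parent directory on that server
    printf '%s\t-fstype=nfs\t%s:%s/%s\n' "$user" "$server" "$export" "$user"
}

# Hypothetical usage, building a map for users on different servers:
#   automount_entry alice fs1.example.com /export/staff
#   automount_entry bob   fs2.example.com /export/students
```

Each user then sees /home/$USERNAME regardless of which server or filesystem holds the directory, and only the homes actually accessed get mounted.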

In the particular problem scenario, local policy requires the 'staff'
directory on the server to not be world-accessible, but they still want to
mount the individual home directories from there onto client machines as
required.
I cannot easily justify that policy, but the point is that it works with
NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/krb5.  I don't
think we can fix this inconsistency but maybe we can explain it.

I think your text is more accurate than mine, but also a little more vague so
the important point may not be immediately obvious.  That might be a price we
have to pay for accuracy.

Thanks,
NeilBrown
J. Bruce Fields Oct. 4, 2012, 4:07 p.m. UTC | #5
On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> 
> > On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> > > On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > > > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > > > > wrote:
> > > > > 
> > > > > > I guess you're right.  So it starts to sound more like: "you have a
> > > > > > confusing setup.  Your export configuration says one thing, and your
> > > > > > filesystem permissions say another.  Under NFSv3 the confusion didn't
> > > > > > matter, but now it does--time to fix it."
> > > > > > 
> > > > > 
> > > > > That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> > > > > 
> > > > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > > > "root_squash" is active, but might not).
> > > > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > > > for consistency..
> > > > > 
> > > > > Is the following appropriate?  Is there somewhere better to put this caveat?
> > > > 
> > > > Unfortunately, it's more complicated than this, as it depends on client
> > > > implementation and configuration details.
> > > > 
> > > > Something like this would be more accurate but possibly too long:
> > > > 
> > > > 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> > > > 	mountd acting as root, but under NFSv4 the mount path is looked
> > > > 	up using the client's credentials.  This means that, for
> > > > 	example, if a client mounts using a krb5 credential that the
> > > > 	server maps to an "anonmyous" user, then the mount will only
> > > > 	succeed if that directory and all its parents allow eXecute
> > > > 	permissions.
> > > 
> > > So you're listing this as a "feature" rather than a bug? There should be
> > > no reason to constrain the pseudofs to use the permission checks from
> > > the underlying filesystem.
> > 
> > I'd be fine with that.
> > 
> > (That still leaves some subtle v3/v4 difference in the case of mount
> > paths underneath an export?
> > 
> > What *is* the existing mountd behavior there, exactly?  I'm inclined to
> > think allowing mounts of arbitrary subdirectories is a bug, but maybe
> > there's some historical reason for it or maybe someone already depends
> > on it.)
> > 
> > --b.
> 
> The behaviour is simple that you mount a filehandle (typically belonging to a
> directory) and that filehandle can be anything inside any exported filesystem.

It's not the nfsd behavior that bothers me--there's nothing we can do
about the fact that access by filehandle can bypass directory
permissions.

What bothers me is that mountd will apparently allow anyone to do a lookup
anywhere in an exported filesystem.

I don't know--maybe I shouldn't be so concerned about the possibility a
rogue user could figure out that my "Music" directory includes an
unreasonable number of Miles Davis titles.

> Yes, please do depend on being able to mount filehandles that aren't to root
> of a filesystem.
> 
> The case the brought this issue to my attention involved the server having
> a directory containing hundreds of home directories.  This directory is
> exported.
> 
> If they mount that top level directory they get horrible performance.  If
> they use an automounter to just mount the homes that are accessed it works
> better.  They weren't able to explain why but my guess is that some tools
> (GUI filesystem browser) would occasionally do the equivalent of "ls  -l" of
> the top level directory which would hammer nfs-idmapd and probably ldap....
> though you would think that would get cached and not be a problem for long.
> So maybe it is more subtle than that.

Getting all the id->name mappings for a 100-entry directory is going to
require 100 serialized upcalls to idmapd (and then possibly ldap), and
by default it looks like the idmapd cache will go cold after 10
minutes....  Not hard to imagine that could be a problem.

Running multiple idmapd processes would be easy and might help?  Though
not if the client's just giving us the getattrs one at a time.
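
The cost argument above can be put as simple arithmetic. This is a back-of-the-envelope sketch only: the function name estimate_listing_ms is made up, and the per-upcall latency and worker count are inputs the caller has to supply, not measured values.

```shell
#!/bin/sh
# Rough cost model for listing a directory with a cold id->name cache:
# one upcall per entry, handled in rounds of `workers` at a time.
estimate_listing_ms() {
    entries=$1          # number of directory entries to map
    per_upcall_ms=$2    # assumed latency of one idmapd upcall
    workers=${3:-1}     # parallel idmapd processes; 1 = fully serialized
    # Round up so a partial last round still costs a full upcall.
    rounds=$(( (entries + workers - 1) / workers ))
    echo $(( rounds * per_upcall_ms ))
}

# Hypothetical usage:
#   estimate_listing_ms 100 5      # serialized
#   estimate_listing_ms 100 5 4    # four workers
```

The parallel case is only an upper bound on the benefit: as noted above, extra workers do not help if the client sends its getattrs one at a time.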

Or maybe the problem's somewhere else entirely, but that's a real bug if
we aren't giving good performance on /home.

--b.

> I've built similar setups before.  There is something attractive about
> everyone's home directory being /home/$USERNAME even though they are on
> different servers and different filesystems.
> 
> In the particular problem scenario, local policy requires that the 'staff'
> directory on the server to not be world-accessible, but they still want to
> mount the individual home directories from there onto client machines as
> required.
> I cannot easily justify that policy, but the point is that it works with
> NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/kerb5.  I don't
> think we can fix this inconsistency but maybe we can explain it.
> 
> I think your text is more accurate than mine, but also a little more vague so
> the important may not be immediately obvious.  That might be a price we have
> to pay for accuracy.
NeilBrown Oct. 8, 2012, 6:03 a.m. UTC | #6
On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:

> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
> > On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> > 
> > > On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> > > > On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > > > > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > > > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > > > > > wrote:
> > > > > > 
> > > > > > > I guess you're right.  So it starts to sound more like: "you have a
> > > > > > > confusing setup.  Your export configuration says one thing, and your
> > > > > > > filesystem permissions say another.  Under NFSv3 the confusion didn't
> > > > > > > matter, but now it does--time to fix it."
> > > > > > > 
> > > > > > 
> > > > > > That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> > > > > > 
> > > > > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > > > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > > > > "root_squash" is active, but might not).
> > > > > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > > > > for consistency..
> > > > > > 
> > > > > > Is the following appropriate?  Is there somewhere better to put this caveat?
> > > > > 
> > > > > Unfortunately, it's more complicated than this, as it depends on client
> > > > > implementation and configuration details.
> > > > > 
> > > > > Something like this would be more accurate but possibly too long:
> > > > > 
> > > > > 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> > > > > 	mountd acting as root, but under NFSv4 the mount path is looked
> > > > > 	up using the client's credentials.  This means that, for
> > > > > 	example, if a client mounts using a krb5 credential that the
> > > > > 	server maps to an "anonmyous" user, then the mount will only
> > > > > 	succeed if that directory and all its parents allow eXecute
> > > > > 	permissions.
> > > > 
> > > > So you're listing this as a "feature" rather than a bug? There should be
> > > > no reason to constrain the pseudofs to use the permission checks from
> > > > the underlying filesystem.
> > > 
> > > I'd be fine with that.
> > > 
> > > (That still leaves some subtle v3/v4 difference in the case of mount
> > > paths underneath an export?
> > > 
> > > What *is* the existing mountd behavior there, exactly?  I'm inclined to
> > > think allowing mounts of arbitrary subdirectories is a bug, but maybe
> > > there's some historical reason for it or maybe someone already depends
> > > on it.)
> > > 
> > > --b.
> > 
> > The behaviour is simple that you mount a filehandle (typically belonging to a
> > directory) and that filehandle can be anything inside any exported filesystem.
> 
> It's not the nfsd behavior that bothers me--there's nothing we can do
> about the fact that access by filehandle can bypass directory
> permissions.
> 
> What bothers is that mountd will apparently allow anyone to do a lookup
> anywhere in an exported filesystem.

Not anyone - it requires a privileged source port from a known host.
So it is only "anyone who can get 'root'".

> 
> I don't know--maybe I shouldn't be so concerned about the possibility a
> rogue user could figure out that my "Music" directory includes an
> unreasonable number of Miles Davis titles.
> 
> > Yes, please do depend on being able to mount filehandles that aren't to root
> > of a filesystem.
> > 
> > The case the brought this issue to my attention involved the server having
> > a directory containing hundreds of home directories.  This directory is
> > exported.
> > 
> > If they mount that top level directory they get horrible performance.  If
> > they use an automounter to just mount the homes that are accessed it works
> > better.  They weren't able to explain why but my guess is that some tools
> > (GUI filesystem browser) would occasionally do the equivalent of "ls  -l" of
> > the top level directory which would hammer nfs-idmapd and probably ldap....
> > though you would think that would get cached and not be a problem for long.
> > So maybe it is more subtle than that.
> 
> Getting all the id->name mappings for a 100-entry directory is going to
> require a 100 serialized upcalls to idmapd (and then possibly ldap), and
> by default it looks like the idmapd cache will go cold after 10
> minutes....  Not hard to imagine that could be a problem.
> 
> Running multiple idmapd process would be easy and might help?  Though
> not if the client's just giving us the getattrs one at a time.
> 
> Or maybe the problem's somewhere else entirely, but that's a real bug if
> we aren't giving good performance on /home.

I did some experimenting..
On both 'client' and 'server':
  for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done >> /etc/passwd

On server in suitable directory

  for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done

Mount that directory onto the client with NFSv3 and "time ls -l" takes a
little under 4 seconds.
Mount with NFSv4 and it takes about the same.  However:

.....
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
....
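
To put a number on how many entries came back unmapped, the listing can be piped through a small filter. This is a minimal sketch: the function name count_unmapped is made up, and it simply treats the numeric owner 4294967294 (2^32 - 2) seen in the listing above as the marker for an owner that was not mapped to a name.

```shell
#!/bin/sh
# Read `ls -l` output on stdin and count entries whose owner column
# is the numeric id 4294967294 instead of a user name.
count_unmapped() {
    awk '
        # Only lines that start with a file-type character are entries;
        # this skips the "total N" header and any blank lines.
        $1 ~ /^[-dlbcps]/ {
            total++
            if ($3 == "4294967294") bad++
        }
        END { printf "%d of %d entries unmapped\n", bad, total }
    '
}

# Hypothetical usage on the client:
#   ls -l /mnt/test | count_unmapped
```

Running it before and after a cache flush or remount gives a quick check of whether the mapping failures are stable or shrinking.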


tcpdump shows the server is returning the right stuff, but something is going
wrong on the client.  I've tried unmounting/remounting and killing/restarting
rpc.idmapd.
I had some config problems previously .. is there any chance that these
unknown entries are in a cache?  Any easy way to view or flush the cache?

Of course this is with text-file password lookup.  LDAP might be slower but
I'd be surprised if it was much slower.

NeilBrown



> 
> --b.
> 
> > I've built similar setups before.  There is something attractive about
> > everyone's home directory being /home/$USERNAME even though they are on
> > different servers and different filesystems.
> > 
> > In the particular problem scenario, local policy requires that the 'staff'
> > directory on the server to not be world-accessible, but they still want to
> > mount the individual home directories from there onto client machines as
> > required.
> > I cannot easily justify that policy, but the point is that it works with
> > NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/kerb5.  I don't
> > think we can fix this inconsistency but maybe we can explain it.
> > 
> > I think your text is more accurate than mine, but also a little more vague so
> > the important may not be immediately obvious.  That might be a price we have
> > to pay for accuracy.
Steve Dickson Oct. 8, 2012, 11:42 a.m. UTC | #7
On 08/10/12 02:03, NeilBrown wrote:
> On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> 
>> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
>>> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
>>> wrote:
>>>
>>>> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
>>>>> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
>>>>>> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
>>>>>>> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I guess you're right.  So it starts to sound more like: "you have a
>>>>>>>> confusing setup.  Your export configuration says one thing, and your
>>>>>>>> filesystem permissions say another.  Under NFSv3 the confusion didn't
>>>>>>>> matter, but now it does--time to fix it."
>>>>>>>>
>>>>>>>
>>>>>>> That's the best I could come to - I'm glad to have it confirmed.  Thanks!
>>>>>>>
>>>>>>> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
>>>>>>> is in use, and uses 'root' when auth_sys is used (which might be anon if
>>>>>>> "root_squash" is active, but might not).
>>>>>>> I wonder if it would work to use auth_none for the mount-time lookup, just
>>>>>>> for consistency..
>>>>>>>
>>>>>>> Is the following appropriate?  Is there somewhere better to put this caveat?
>>>>>>
>>>>>> Unfortunately, it's more complicated than this, as it depends on client
>>>>>> implementation and configuration details.
>>>>>>
>>>>>> Something like this would be more accurate but possibly too long:
>>>>>>
>>>>>> 	Note that under NFSv2 and NFSv3, the mount path is traversed by
>>>>>> 	mountd acting as root, but under NFSv4 the mount path is looked
>>>>>> 	up using the client's credentials.  This means that, for
>>>>>> 	example, if a client mounts using a krb5 credential that the
>>>>>> 	server maps to an "anonmyous" user, then the mount will only
>>>>>> 	succeed if that directory and all its parents allow eXecute
>>>>>> 	permissions.
>>>>>
>>>>> So you're listing this as a "feature" rather than a bug? There should be
>>>>> no reason to constrain the pseudofs to use the permission checks from
>>>>> the underlying filesystem.
>>>>
>>>> I'd be fine with that.
>>>>
>>>> (That still leaves some subtle v3/v4 difference in the case of mount
>>>> paths underneath an export?
>>>>
>>>> What *is* the existing mountd behavior there, exactly?  I'm inclined to
>>>> think allowing mounts of arbitrary subdirectories is a bug, but maybe
>>>> there's some historical reason for it or maybe someone already depends
>>>> on it.)
>>>>
>>>> --b.
>>>
>>> The behaviour is simple that you mount a filehandle (typically belonging to a
>>> directory) and that filehandle can be anything inside any exported filesystem.
>>
>> It's not the nfsd behavior that bothers me--there's nothing we can do
>> about the fact that access by filehandle can bypass directory
>> permissions.
>>
>> What bothers is that mountd will apparently allow anyone to do a lookup
>> anywhere in an exported filesystem.
> 
> Not anyone - it requires a privileged source port from a known host.
> So it is only "anyone who can get 'root'".
> 
>>
>> I don't know--maybe I shouldn't be so concerned about the possibility a
>> rogue user could figure out that my "Music" directory includes an
>> unreasonable number of Miles Davis titles.
>>
>>> Yes, please do depend on being able to mount filehandles that aren't to root
>>> of a filesystem.
>>>
>>> The case the brought this issue to my attention involved the server having
>>> a directory containing hundreds of home directories.  This directory is
>>> exported.
>>>
>>> If they mount that top level directory they get horrible performance.  If
>>> they use an automounter to just mount the homes that are accessed it works
>>> better.  They weren't able to explain why but my guess is that some tools
>>> (GUI filesystem browser) would occasionally do the equivalent of "ls  -l" of
>>> the top level directory which would hammer nfs-idmapd and probably ldap....
>>> though you would think that would get cached and not be a problem for long.
>>> So maybe it is more subtle than that.
>>
>> Getting all the id->name mappings for a 100-entry directory is going to
>> require a 100 serialized upcalls to idmapd (and then possibly ldap), and
>> by default it looks like the idmapd cache will go cold after 10
>> minutes....  Not hard to imagine that could be a problem.
>>
>> Running multiple idmapd process would be easy and might help?  Though
>> not if the client's just giving us the getattrs one at a time.
>>
>> Or maybe the problem's somewhere else entirely, but that's a real bug if
>> we aren't giving good performance on /home.
> 
> I did some experimenting..
> On both 'client' and 'server':
>   for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done
>>> /etc/passwd
> 
> On server in suitable directory
> 
>   for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
> 
> Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> little under 4 seconds.
> Mount with NFSv4 and it takes about the same.  However:
> 
> .....
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> ....
> 
> 
> tcpdump shows the server is returning the write stuff, but something if going
> wrong on the client.  I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.
> I had some config problems previously .. is there any chance that these
> unknown entries are in a cache?  Any easy way to view or flush the cache?
Assuming you are using the keyring based idmapper, "nfsidmap -cv" will
clear the keyring of user and group ids. See nfsidmap(5).

If you are using rpc.idmapd, I believe
    echo `date +'%s'` > /proc/net/rpc/nfs4.idtoname/flush
will do the trick.... The CITI faq 
    http://www.citi.umich.edu/projects/nfsv4/linux/faq/
has a section on working with this cache...
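
The two suggestions above can be combined into one helper. This is a minimal sketch: the function name flush_idmap_cache and the IDTONAME_FLUSH override are made up, and choosing between the two methods by checking which interface is available is an assumption, not something the tools do themselves.

```shell
#!/bin/sh
# Flush file named in the message above; overridable for testing.
IDTONAME_FLUSH=${IDTONAME_FLUSH:-/proc/net/rpc/nfs4.idtoname/flush}

flush_idmap_cache() {
    if [ -w "$IDTONAME_FLUSH" ]; then
        # rpc.idmapd case: write the current time to the flush file.
        date +%s > "$IDTONAME_FLUSH" && echo "flushed via $IDTONAME_FLUSH"
    elif command -v nfsidmap >/dev/null 2>&1; then
        # Keyring-based idmapper: clear cached user and group ids.
        nfsidmap -c && echo "cleared keyring via nfsidmap"
    else
        echo "no known idmap cache interface found" >&2
        return 1
    fi
}

# Hypothetical usage, as root:
#   flush_idmap_cache
```

Both branches need root, and a flush only helps if the stale entries really are in one of these caches.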

steved.

> 
> Of course this is with text-file password lookup.  LDAP might be slower but
> I'd be surprised if it was much slower.
> 
> NeilBrown
> 
> 
> 
>>
>> --b.
>>
>>> I've built similar setups before.  There is something attractive about
>>> everyone's home directory being /home/$USERNAME even though they are on
>>> different servers and different filesystems.
>>>
>>> In the particular problem scenario, local policy requires that the 'staff'
>>> directory on the server to not be world-accessible, but they still want to
>>> mount the individual home directories from there onto client machines as
>>> required.
>>> I cannot easily justify that policy, but the point is that it works with
>>> NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/kerb5.  I don't
>>> think we can fix this inconsistency but maybe we can explain it.
>>>
>>> I think your text is more accurate than mine, but also a little more vague so
>>> the important point may not be immediately obvious.  That might be a price we have
>>> to pay for accuracy.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields Oct. 8, 2012, 12:19 p.m. UTC | #8
On Mon, Oct 08, 2012 at 05:03:04PM +1100, NeilBrown wrote:
> On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> > It's not the nfsd behavior that bothers me--there's nothing we can do
> > about the fact that access by filehandle can bypass directory
> > permissions.
> > 
> > What bothers is that mountd will apparently allow anyone to do a lookup
> > anywhere in an exported filesystem.
> 
> Not anyone - it requires a privileged source port from a known host.
> So it is only "anyone who can get 'root'".

As you know, that's not necessarily a good assumption.  And if somebody's
using sec=krb5, they're explicitly saying that they don't trust that
assumption.

> > Getting all the id->name mappings for a 100-entry directory is going to
> > require a 100 serialized upcalls to idmapd (and then possibly ldap), and
> > by default it looks like the idmapd cache will go cold after 10
> > minutes....  Not hard to imagine that could be a problem.
> > 
> > Running multiple idmapd process would be easy and might help?  Though
> > not if the client's just giving us the getattrs one at a time.
> > 
> > Or maybe the problem's somewhere else entirely, but that's a real bug if
> > we aren't giving good performance on /home.
> 
> I did some experimenting..
> On both 'client' and 'server':
>   for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done
> >> /etc/passwd
> 
> On server in suitable directory
> 
>   for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
> 
> Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> little under 4 seconds.
> Mount with NFSv4 and it takes about the same.  However:

OK, that's interesting.  I wonder what the problem is, then?  I can't
think of what else would make /home different.

> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> ....

Oops.

> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client.  I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.
> I had some config problems previously .. is there any chance that these
> unknown entries are in a cache?  Any easy way to view or flush the cache?

Not that I know of.

What client version is this, and is it using the new (nfsidmap) or old
(idmapd) idmapper?

--b.
J. Bruce Fields Oct. 8, 2012, 12:20 p.m. UTC | #9
On Mon, Oct 08, 2012 at 07:42:34AM -0400, Steve Dickson wrote:
> Assuming you are using the keyring based idmapper, "nfsidmap -cv" will
> clear the keyring of user and group ids. See nfsidmap(5).

Oh, good, I'd missed that....

> If you using rpc.idmapd, I believe 
>     echo `date +'%s'` > /proc/net/rpc/nfs4.idtoname/flush
> will do the trick.... The CITI faq 
>     http://www.citi.umich.edu/projects/nfsv4/linux/faq/
> has a section on work with this cache...

No, that's just the server-side cache, but if as Neil says the
on-the-wire replies look correct, then the problem is client-side.
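
For reference, a minimal sketch of the server-side flush mentioned above.
The function name flush_server_idmap and its proc_root parameter are
hypothetical, introduced here only for illustration; the sketch assumes the
nfsd idmap caches appear as nfs4.idtoname and nfs4.nametoid under
/proc/net/rpc.  It does nothing for the client-side cache.

```shell
# Hypothetical helper: flush the server-side nfsd idmap caches by writing
# the current time to each cache's "flush" file.
# proc_root is a parameter only so the function can be tried on a scratch dir.
flush_server_idmap() {
    proc_root="${1:-/proc/net/rpc}"
    now=$(date +%s)
    for cache in nfs4.idtoname nfs4.nametoid; do
        flush_file="$proc_root/$cache/flush"
        if [ -w "$flush_file" ]; then
            echo "$now" > "$flush_file"
        else
            echo "skipping $flush_file (not writable)" >&2
        fi
    done
}
```

On a real server this would be run as root with no argument.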

--b.
malahal naineni Oct. 8, 2012, 1:54 p.m. UTC | #10
NeilBrown [neilb@suse.de] wrote:
> Mount with NFSv4 and it takes about the same.  However:
> 
> .....
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> ....
> 
> 
> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client.  I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.

As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
server sending numeric ids by default in AUTH_SYS (commit
e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
code).
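
The arithmetic behind that number, as a small sketch: -2 stored in an
unsigned 32-bit id is 2^32 - 2.  The function name uid_as_unsigned32 is
hypothetical, and the sketch assumes the shell does 64-bit arithmetic.

```shell
# Hypothetical helper: show a (possibly negative) id as an unsigned 32-bit value.
uid_as_unsigned32() {
    # Mask to the low 32 bits; assumes 64-bit shell arithmetic.
    echo $(( $1 & 0xFFFFFFFF ))
}

uid_as_unsigned32 -2    # prints 4294967294
```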

I hand-patched the server commit, but my client was an older one. That is
how I got into my issue. Not sure if you are running into a similar
issue.

Regards, Malahal.

J. Bruce Fields Oct. 8, 2012, 2:18 p.m. UTC | #11
On Mon, Oct 08, 2012 at 08:54:52AM -0500, Malahal Naineni wrote:
> NeilBrown [neilb@suse.de] wrote:
> > Mount with NFSv4 and it takes about the same.  However:
> > 
> > .....
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> > drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> > drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> > ....
> > 
> > 
> > tcpdump shows the server is returning the right stuff, but something is going
> > wrong on the client.  I've tried unmounting/remounting and killing/restarting
> > rpc.idmapd.
> 
> As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
> server sending numeric ids by default in AUTH_SYS (commit
> e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
> them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
> code).
> 
> I hand patched server commit, but my client was an older one. That is
> how I got into my issue. Not sure, if you are running into a similar
> issue.

Oh, could be--but then why would some of the id's still be mapped
correctly?

--b.
malahal naineni Oct. 8, 2012, 3:26 p.m. UTC | #12
J. Bruce Fields [bfields@fieldses.org] wrote:
> On Mon, Oct 08, 2012 at 08:54:52AM -0500, Malahal Naineni wrote:
> > NeilBrown [neilb@suse.de] wrote:
> > > Mount with NFSv4 and it takes about the same.  However:
> > > 
> > > .....
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> > > drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> > > drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> > > ....
> > > 
> > > 
> > > tcpdump shows the server is returning the right stuff, but something is going
> > > wrong on the client.  I've tried unmounting/remounting and killing/restarting
> > > rpc.idmapd.
> > 
> > As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
> > server sending numeric ids by default in AUTH_SYS (commit
> > e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
> > them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
> > code).
> > 
> > I hand patched server commit, but my client was an older one. That is
> > how I got into my issue. Not sure, if you are running into a similar
> > issue.
> 
> Oh, could be--but then why would some of the id's still be mapped
> correctly?

Wild guess, those objects are created by client and didn't get their
attributes updated yet from server???

FYI, a co-worker here had RHEL6.3 server and RHEL6.2 client that
exhibited this nobody issue with NFSv4.

Regards, Malahal.

NeilBrown Oct. 9, 2012, 12:30 a.m. UTC | #13
On Mon, 08 Oct 2012 07:42:34 -0400 Steve Dickson <SteveD@redhat.com> wrote:

> 
> 
> On 08/10/12 02:03, NeilBrown wrote:
> > On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> > 
> >> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
> >>> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> >>> wrote:
> >>>
> >>>> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> >>>>> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> >>>>>> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> >>>>>>> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I guess you're right.  So it starts to sound more like: "you have a
> >>>>>>>> confusing setup.  Your export configuration says one thing, and your
> >>>>>>>> filesystem permissions say another.  Under NFSv3 the confusion didn't
> >>>>>>>> matter, but now it does--time to fix it."
> >>>>>>>>
> >>>>>>>
> >>>>>>> That's the best I could come to - I'm glad to have it confirmed.  Thanks!
> >>>>>>>
> >>>>>>> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> >>>>>>> is in use, and uses 'root' when auth_sys is used (which might be anon if
> >>>>>>> "root_squash" is active, but might not).
> >>>>>>> I wonder if it would work to use auth_none for the mount-time lookup, just
> >>>>>>> for consistency..
> >>>>>>>
> >>>>>>> Is the following appropriate?  Is there somewhere better to put this caveat?
> >>>>>>
> >>>>>> Unfortunately, it's more complicated than this, as it depends on client
> >>>>>> implementation and configuration details.
> >>>>>>
> >>>>>> Something like this would be more accurate but possibly too long:
> >>>>>>
> >>>>>> 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> >>>>>> 	mountd acting as root, but under NFSv4 the mount path is looked
> >>>>>> 	up using the client's credentials.  This means that, for
> >>>>>> 	example, if a client mounts using a krb5 credential that the
> >>>>>> 	server maps to an "anonymous" user, then the mount will only
> >>>>>> 	succeed if that directory and all its parents allow eXecute
> >>>>>> 	permissions.
> >>>>>
> >>>>> So you're listing this as a "feature" rather than a bug? There should be
> >>>>> no reason to constrain the pseudofs to use the permission checks from
> >>>>> the underlying filesystem.
> >>>>
> >>>> I'd be fine with that.
> >>>>
> >>>> (That still leaves some subtle v3/v4 difference in the case of mount
> >>>> paths underneath an export?
> >>>>
> >>>> What *is* the existing mountd behavior there, exactly?  I'm inclined to
> >>>> think allowing mounts of arbitrary subdirectories is a bug, but maybe
> >>>> there's some historical reason for it or maybe someone already depends
> >>>> on it.)
> >>>>
> >>>> --b.
> >>>
> >>> The behaviour is simply that you mount a filehandle (typically belonging to a
> >>> directory) and that filehandle can be anything inside any exported filesystem.
> >>
> >> It's not the nfsd behavior that bothers me--there's nothing we can do
> >> about the fact that access by filehandle can bypass directory
> >> permissions.
> >>
> >> What bothers is that mountd will apparently allow anyone to do a lookup
> >> anywhere in an exported filesystem.
> > 
> > Not anyone - it requires a privileged source port from a known host.
> > So it is only "anyone who can get 'root'".
> > 
> >>
> >> I don't know--maybe I shouldn't be so concerned about the possibility a
> >> rogue user could figure out that my "Music" directory includes an
> >> unreasonable number of Miles Davis titles.
> >>
> >>> Yes, please do depend on being able to mount filehandles that aren't to root
> >>> of a filesystem.
> >>>
> >>> The case that brought this issue to my attention involved the server having
> >>> a directory containing hundreds of home directories.  This directory is
> >>> exported.
> >>>
> >>> If they mount that top level directory they get horrible performance.  If
> >>> they use an automounter to just mount the homes that are accessed it works
> >>> better.  They weren't able to explain why but my guess is that some tools
> >>> (GUI filesystem browser) would occasionally do the equivalent of "ls  -l" of
> >>> the top level directory which would hammer nfs-idmapd and probably ldap....
> >>> though you would think that would get cached and not be a problem for long.
> >>> So maybe it is more subtle than that.
> >>
> >> Getting all the id->name mappings for a 100-entry directory is going to
> >> require a 100 serialized upcalls to idmapd (and then possibly ldap), and
> >> by default it looks like the idmapd cache will go cold after 10
> >> minutes....  Not hard to imagine that could be a problem.
> >>
> >> Running multiple idmapd process would be easy and might help?  Though
> >> not if the client's just giving us the getattrs one at a time.
> >>
> >> Or maybe the problem's somewhere else entirely, but that's a real bug if
> >> we aren't giving good performance on /home.
> > 
> > I did some experimenting..
> > On both 'client' and 'server':
> >   for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done
> >>> /etc/passwd
> > 
> > On server in suitable directory
> > 
> >   for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
> > 
> > Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> > little under 4 seconds.
> > Mount with NFSv4 and it takes about the same.  However:
> > 
> > .....
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> > drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> > drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> > ....
> > 
> > 
> > > tcpdump shows the server is returning the right stuff, but something is going
> > wrong on the client.  I've tried unmounting/remounting and killing/restarting
> > rpc.idmapd.
> > I had some config problems previously .. is there any chance that these
> > unknown entries are in a cache?  Any easy way to view or flush the cache?
> Assuming you are using the keyring based idmapper, "nfsidmap -cv" will
> clear the keyring of user and group ids. See nfsidmap(5).

Thanks... though I'm running some ancient system which only has nfs-utils
1.2.5 and so "nfsidmap -cv" returns silently, but does nothing.
That's OK, I have source -- build, copy, test...

# /tmp/nfsidmap -cv
nfsidmap: fopen(/proc/keys) failed: No such file or directory


Hmm, not what I was expecting ... grep grep ahhh:

config KEYS_DEBUG_PROC_KEYS
        bool "Enable the /proc/keys file by which keys may be viewed"

# zcat /proc/config.gz | grep KEYS_DEBUG_PROC
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set


That explains it then - we need a debug option set or we cannot flush the
idmap cache.  I guess flushing a cache is a debugging operation, but it's a
bit surprising.  And in my case: annoying.

Would you expect distros to enable CONFIG_KEYS_DEBUG_PROC_KEYS?  If so I'll
get it enabled for SUSE (it is enabled in the 'debug' kernel, but not
'desktop' or 'default').  If not, the man page should maybe say that -c and
-r require a kernel with debugging enabled.
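
Until that is sorted out, a rough pre-check along these lines might save
some confusion.  It is only a sketch: keyring_flush_possible and its
keys_file parameter are hypothetical names, and it assumes (as with the
nfsidmap build tested above) that "nfsidmap -c" needs a readable /proc/keys.

```shell
# Hypothetical helper: succeed only if the keys file nfsidmap reads is present.
# keys_file is a parameter only so the check can be tried on another path.
keyring_flush_possible() {
    keys_file="${1:-/proc/keys}"
    [ -r "$keys_file" ]
}

# Intended use (left commented out so nothing is flushed by accident):
#   if keyring_flush_possible; then
#       nfsidmap -c
#   else
#       echo "no /proc/keys: kernel lacks CONFIG_KEYS_DEBUG_PROC_KEYS?" >&2
#   fi
```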

But I set up another machine as the client and configured it properly before
testing, and everything works fine and reasonably fast.  So my guess that
id lookup for thousands of different ids caused slowness was probably wrong.


Thanks,
NeilBrown
NeilBrown Oct. 9, 2012, 12:33 a.m. UTC | #14
On Mon, 8 Oct 2012 10:26:47 -0500 Malahal Naineni <malahal@us.ibm.com> wrote:

> J. Bruce Fields [bfields@fieldses.org] wrote:
> > On Mon, Oct 08, 2012 at 08:54:52AM -0500, Malahal Naineni wrote:
> > > NeilBrown [neilb@suse.de] wrote:
> > > > Mount with NFSv4 and it takes about the same.  However:
> > > > 
> > > > .....
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2974
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2975
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2976
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2977
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2978
> > > > drwxr-xr-x 2 u2979      root 4096 Oct  8 16:19 2979
> > > > drwxr-xr-x 2 u2980      root 4096 Oct  8 16:19 2980
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2981
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2982
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2983
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2984
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2985
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct  8 16:19 2986
> > > > ....
> > > > 
> > > > 
> > > > tcpdump shows the server is returning the right stuff, but something is going
> > > > wrong on the client.  I've tried unmounting/remounting and killing/restarting
> > > > rpc.idmapd.
> > > 
> > > As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
> > > server sending numeric ids by default in AUTH_SYS (commit
> > > e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
> > > them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
> > > code).
> > > 
> > > I hand patched server commit, but my client was an older one. That is
> > > how I got into my issue. Not sure, if you are running into a similar
> > > issue.
> > 
> > Oh, could be--but then why would some of the id's still be mapped
> > correctly?
> 
> Wild guess, those objects are created by client and didn't get their
> attributes updated yet from server???
> 
> FYI, a co-worker here had RHEL6.3 server and RHEL6.2 client that
> exhibited this nobody issue with NFSv4.
> 
> Regards, Malahal.

I think the original cause of my problem was that I had inconsistent settings
for 'Domain' in 'idmapd.conf'.  That seems to have resulted in 'nobody'
entries being cached which I now cannot flush.
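
A quick way to compare the setting on both machines, as a sketch only.  The
function name idmap_domain is hypothetical, and the pattern assumes the
usual "Domain = value" spelling on an uncommented line of idmapd.conf.

```shell
# Hypothetical helper: print the first uncommented Domain value from idmapd.conf.
idmap_domain() {
    conf="${1:-/etc/idmapd.conf}"
    # Lines starting with '#' do not match, so commented-out settings are ignored.
    sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$conf" \
        | head -n 1
}
```

Run it on client and server and compare the two results by eye; no output
means no Domain line was found in that file.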

I'm running a 3.5 kernel on the client, so the issues you mentioned won't be
affecting me.

Thanks,
NeilBrown
Patch

diff --git a/utils/exportfs/exports.man b/utils/exportfs/exports.man
index bc1de73..91e4b9c 100644
--- a/utils/exportfs/exports.man
+++ b/utils/exportfs/exports.man
@@ -126,6 +126,10 @@  will be enforced only for access using flavors listed in the immediately
 preceding sec= option.  The only options that are permitted to vary in
 this way are ro, rw, no_root_squash, root_squash, and all_squash.
 .PP
+When RPCSEC_GSS is used with NFSv4, a client will only be able to mount a
+directory if that directory and all its ancestors give eXecute access
+to "world".
+.PP
 .SS General Options
 .BR exportfs
 understands the following export options:
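
The condition the proposed text describes can be checked on the server with
something like the sketch below.  It is not part of the patch:
world_can_traverse is a hypothetical name, and it assumes GNU stat
("stat -c %a").  As discussed in the thread, whether the condition actually
matters depends on the client and the credential it mounts with.

```shell
# Hypothetical helper: succeed only if a directory and all its ancestors
# grant execute (search) permission to "other".
world_can_traverse() {
    dir=$(cd "$1" && pwd -P) || return 2
    while :; do
        mode=$(stat -c '%a' "$dir") || return 2
        # The last octal digit holds the "other" bits; odd means execute is set.
        case "$mode" in
            *[1357]) ;;
            *) echo "no world execute: $dir" >&2; return 1 ;;
        esac
        [ "$dir" = "/" ] && return 0
        dir=$(dirname "$dir")
    done
}
```

For example, "world_can_traverse /export/staff/alice" (path made up) reports
the first directory on the way up to / that lacks the bit.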