Message ID | 1470861005.2694.12.camel@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Aug 10, 2016 at 4:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
>> On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
>> >
>> > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
>> > included), the parent of the root is itself. IOW, at the root, '.' and
>> > '..' refer to the same inode.
>> >
>> > Change the ceph client to do the same, as this allows users to get
>> > valid stat info for '..', as well as eliminating some special-casing.
>> >
>> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
>>
>> Don't forget Client::_lookup:
>>
>>   if (dname == "..") {
>>     if (dir->dn_set.empty())
>>       r = -ENOENT;
>>     else
>>       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
>>     goto done;
>>   }
>>
>> Otherwise LGTM.
>
> Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
> then something like the patch below?

Well, that's tricky actually. My understanding is that if dn_set is empty then either the inode is unlinked or it is the root inode (from the client's perspective). So the below patch is probably not quite right? I think if the directory is unlinked but not the root, its ".." should still refer to its first parent? The ENOENT error is probably wrong.

> Note that this patch is not strictly necessary, but it does simplify
> some other changes that I have queued up:

I think the patch is a good change but there may be some other code paths that need fixed. This change needs some simple tests.
On Wed, 2016-08-10 at 16:46 -0400, Patrick Donnelly wrote:
> On Wed, Aug 10, 2016 at 4:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
> > > On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > >
> > > > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
> > > > included), the parent of the root is itself. IOW, at the root, '.' and
> > > > '..' refer to the same inode.
> > > >
> > > > Change the ceph client to do the same, as this allows users to get
> > > > valid stat info for '..', as well as eliminating some special-casing.
> > > >
> > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > >
> > > Don't forget Client::_lookup:
> > >
> > >   if (dname == "..") {
> > >     if (dir->dn_set.empty())
> > >       r = -ENOENT;
> > >     else
> > >       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
> > >     goto done;
> > >   }
> > >
> > > Otherwise LGTM.
> >
> > Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
> > then something like the patch below?
>
> Well, that's tricky actually. My understanding is that if dn_set is
> empty then either the inode is unlinked or it is the root inode (from
> the client's perspective). So the below patch is probably not quite
> right? I think if the directory is unlinked but not the root, its ".."
> should still refer to its first parent? The ENOENT error is probably
> wrong.

Ok, so is there some way to reliably tell whether it's the root? Should we instead check whether its inode number is CEPH_INO_ROOT?

> > Note that this patch is not strictly necessary, but it does simplify
> > some other changes that I have queued up:
>
> I think the patch is a good change but there may be some other code
> paths that need fixed. This change needs some simple tests.

Yeah, agreed. I'll plan to add some if this patch is reasonable.
I just wanted to float the patch out here as an RFC first, in case I was missing some reason that we needed to keep CEPH_INO_DOTDOT. Thanks for having a look!
On Wed, Aug 10, 2016 at 5:06 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Wed, 2016-08-10 at 16:46 -0400, Patrick Donnelly wrote:
>> On Wed, Aug 10, 2016 at 4:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
>> > On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
>> > > On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
>> > > >
>> > > > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
>> > > > included), the parent of the root is itself. IOW, at the root, '.' and
>> > > > '..' refer to the same inode.
>> > > >
>> > > > Change the ceph client to do the same, as this allows users to get
>> > > > valid stat info for '..', as well as eliminating some special-casing.
>> > > >
>> > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
>> > >
>> > > Don't forget Client::_lookup:
>> > >
>> > >   if (dname == "..") {
>> > >     if (dir->dn_set.empty())
>> > >       r = -ENOENT;
>> > >     else
>> > >       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
>> > >     goto done;
>> > >   }
>> > >
>> > > Otherwise LGTM.
>> >
>> > Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
>> > then something like the patch below?
>>
>> Well, that's tricky actually. My understanding is that if dn_set is
>> empty then either the inode is unlinked or it is the root inode (from
>> the client's perspective). So the below patch is probably not quite
>> right? I think if the directory is unlinked but not the root, its ".."
>> should still refer to its first parent? The ENOENT error is probably
>> wrong.
>
> Ok, so is there some way to reliably tell whether it's the root? Should
> we instead check whether its inode number is CEPH_INO_ROOT?

Inode::is_root should work. By the way, I see now that the readdir code is also wrong. It should not need to check dn_set at all (just is_root()).
(It doesn't matter if the directory inode is unlinked, the parent is still visible.)
On Wed, 10 Aug 2016, Jeff Layton wrote:
> On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
> > On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > >
> > > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
> > > included), the parent of the root is itself. IOW, at the root, '.' and
> > > '..' refer to the same inode.
> > >
> > > Change the ceph client to do the same, as this allows users to get
> > > valid stat info for '..', as well as eliminating some special-casing.
> > >
> > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> >
> > Don't forget Client::_lookup:
> >
> >   if (dname == "..") {
> >     if (dir->dn_set.empty())
> >       r = -ENOENT;
> >     else
> >       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
> >     goto done;
> >   }
> >
> > Otherwise LGTM.
>
> Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
> then something like the patch below?
>
> Note that this patch is not strictly necessary, but it does simplify
> some other changes that I have queued up:
>
> diff --git a/src/client/Client.cc b/src/client/Client.cc
> index 5ab0ace4d3df..287baaf20536 100644
> --- a/src/client/Client.cc
> +++ b/src/client/Client.cc
> @@ -5924,7 +5924,7 @@ int Client::_lookup(Inode *dir, const string& dname, int mask,
>  
>    if (dname == "..") {
>      if (dir->dn_set.empty())
> -      r = -ENOENT;
> +      *target = dir;
>      else
>        *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
>      goto done;

IIRC I did the dotdot thing originally because otherwise the '..' entry at the mount point in ls -al didn't point to the parent directory. Having the fs explicitly do .. at all seems pretty weird to me... it seems like the VFS should be doing this. But in any case, I'd just verify that it behaves the same way a real mount does after this change.

sage
On Wed, 2016-08-10 at 21:23 +0000, Sage Weil wrote:
> On Wed, 10 Aug 2016, Jeff Layton wrote:
> > On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
> > > On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > >
> > > > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
> > > > included), the parent of the root is itself. IOW, at the root, '.' and
> > > > '..' refer to the same inode.
> > > >
> > > > Change the ceph client to do the same, as this allows users to get
> > > > valid stat info for '..', as well as eliminating some special-casing.
> > > >
> > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > >
> > > Don't forget Client::_lookup:
> > >
> > >   if (dname == "..") {
> > >     if (dir->dn_set.empty())
> > >       r = -ENOENT;
> > >     else
> > >       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
> > >     goto done;
> > >   }
> > >
> > > Otherwise LGTM.
> >
> > Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
> > then something like the patch below?
> >
> > Note that this patch is not strictly necessary, but it does simplify
> > some other changes that I have queued up:
> >
> > diff --git a/src/client/Client.cc b/src/client/Client.cc
> > index 5ab0ace4d3df..287baaf20536 100644
> > --- a/src/client/Client.cc
> > +++ b/src/client/Client.cc
> > @@ -5924,7 +5924,7 @@ int Client::_lookup(Inode *dir, const string& dname, int mask,
> >  
> >    if (dname == "..") {
> >      if (dir->dn_set.empty())
> > -      r = -ENOENT;
> > +      *target = dir;
> >      else
> >        *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
> >      goto done;
>
> IIRC I did the dotdot thing originally because otherwise the '..' entry at
> the mount point in ls -al didn't point to the parent directory. Having
> the fs explicitly do .. at all seems pretty weird to me... it seems like
> the VFS should be doing this. But in any case, I'd just verify that it
> behaves the same way a real mount does after this change.
>
> sage

The Linux VFS will definitely already handle ".." correctly (as you end up doing a transition to a different vfsmount). So this shouldn't affect ceph-fuse, AFAICT.

I think this change would primarily be noticed by those using libcephfs directly, either in ceph_readdir/ceph_lookup (and related) calls, or during a pathwalk.

That said, Patrick's suggestion to add some tests around this makes sense. I'll plan to spin that up so we can be clear on how the behavior changes.

Thanks,
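As a reference point for the "real mount" comparison discussed in this thread, the following is a minimal sketch, assuming a POSIX system, of checking that '.' and '..' at the root of the local namespace resolve to the same inode. It goes through the local VFS with stat(2) rather than through libcephfs, and the helper names `same_inode` and `root_dotdot_is_root` are hypothetical, chosen only for illustration.

```cpp
#include <sys/stat.h>

#include <cassert>
#include <string>

// Hypothetical helper: true when both paths resolve to the same inode on
// the same device. Returns false if either stat() call fails.
bool same_inode(const std::string& a, const std::string& b) {
  struct stat sa{}, sb{};
  if (::stat(a.c_str(), &sa) != 0 || ::stat(b.c_str(), &sb) != 0)
    return false;
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

// Hypothetical helper: at the root of the namespace, ".." is expected to
// resolve to the root itself.
bool root_dotdot_is_root() {
  return same_inode("/", "/..");
}
```

A test against libcephfs would need the equivalent comparison on the stat data returned for the mount root and for its '..' entry; that part is not sketched here because it depends on the API calls under discussion.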
On Wed, 2016-08-10 at 17:15 -0400, Patrick Donnelly wrote:
> On Wed, Aug 10, 2016 at 5:06 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Wed, 2016-08-10 at 16:46 -0400, Patrick Donnelly wrote:
> > > On Wed, Aug 10, 2016 at 4:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > > On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
> > > > > On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > > > >
> > > > > > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
> > > > > > included), the parent of the root is itself. IOW, at the root, '.' and
> > > > > > '..' refer to the same inode.
> > > > > >
> > > > > > Change the ceph client to do the same, as this allows users to get
> > > > > > valid stat info for '..', as well as eliminating some special-casing.
> > > > > >
> > > > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > > >
> > > > > Don't forget Client::_lookup:
> > > > >
> > > > >   if (dname == "..") {
> > > > >     if (dir->dn_set.empty())
> > > > >       r = -ENOENT;
> > > > >     else
> > > > >       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
> > > > >     goto done;
> > > > >   }
> > > > >
> > > > > Otherwise LGTM.
> > > >
> > > > Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
> > > > then something like the patch below?
> > >
> > > Well, that's tricky actually. My understanding is that if dn_set is
> > > empty then either the inode is unlinked or it is the root inode (from
> > > the client's perspective). So the below patch is probably not quite
> > > right? I think if the directory is unlinked but not the root, its ".."
> > > should still refer to its first parent? The ENOENT error is probably
> > > wrong.
> >
> > Ok, so is there some way to reliably tell whether it's the root? Should
> > we instead check whether its inode number is CEPH_INO_ROOT?
>
> Inode::is_root should work. By the way, I see now that the readdir
> code is also wrong. It should not need to check dn_set at all (just
> is_root()). (It doesn't matter if the directory inode is unlinked, the
> parent is still visible.)

Ahh thanks. I'll see about fixing that up while I'm in there too.

Your point about is_root is valid, but I think we should step back a min and consider how we expect this to work when we mount a subtree of the MDS root.

Suppose I do:

    ceph_mount(&cmount, "/foo/bar/baz");
    ceph_lookup(&cmount, "/..", &st);

...what should we ultimately end up stat'ing there? Should I get back the info for "bar" or "baz"?

Thanks,
On Wed, 2016-08-10 at 18:21 -0400, Jeff Layton wrote:
> On Wed, 2016-08-10 at 17:15 -0400, Patrick Donnelly wrote:
> > On Wed, Aug 10, 2016 at 5:06 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > On Wed, 2016-08-10 at 16:46 -0400, Patrick Donnelly wrote:
> > > > On Wed, Aug 10, 2016 at 4:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > > > On Wed, 2016-08-10 at 16:08 -0400, Patrick Donnelly wrote:
> > > > > > On Wed, Aug 10, 2016 at 12:30 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > > > > >
> > > > > > > The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
> > > > > > > included), the parent of the root is itself. IOW, at the root, '.' and
> > > > > > > '..' refer to the same inode.
> > > > > > >
> > > > > > > Change the ceph client to do the same, as this allows users to get
> > > > > > > valid stat info for '..', as well as eliminating some special-casing.
> > > > > > >
> > > > > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > > > >
> > > > > > Don't forget Client::_lookup:
> > > > > >
> > > > > >   if (dname == "..") {
> > > > > >     if (dir->dn_set.empty())
> > > > > >       r = -ENOENT;
> > > > > >     else
> > > > > >       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
> > > > > >     goto done;
> > > > > >   }
> > > > > >
> > > > > > Otherwise LGTM.
> > > > >
> > > > > Ahh, thanks. So will dir->dn_set.empty() be true at the root? If so,
> > > > > then something like the patch below?
> > > >
> > > > Well, that's tricky actually. My understanding is that if dn_set is
> > > > empty then either the inode is unlinked or it is the root inode (from
> > > > the client's perspective). So the below patch is probably not quite
> > > > right? I think if the directory is unlinked but not the root, its ".."
> > > > should still refer to its first parent? The ENOENT error is probably
> > > > wrong.
> > >
> > > Ok, so is there some way to reliably tell whether it's the root? Should
> > > we instead check whether its inode number is CEPH_INO_ROOT?
> >
> > Inode::is_root should work. By the way, I see now that the readdir
> > code is also wrong. It should not need to check dn_set at all (just
> > is_root()). (It doesn't matter if the directory inode is unlinked, the
> > parent is still visible.)
>
> Ahh thanks. I'll see about fixing that up while I'm in there too.
>
> Your point about is_root is valid, but I think we should step back a
> min and consider how we expect this to work when we mount a subtree of
> the MDS root.
>
> Suppose I do:
>
>     ceph_mount(&cmount, "/foo/bar/baz");
>     ceph_lookup(&cmount, "/..", &st);
>
> ...what should we ultimately end up stat'ing there? Should I get back
> the info for "bar" or "baz"?

Thinking out loud...

I think we'd want that to return the info for "baz", since anything else would mean escaping from cmount.

So, looking at the code...maybe we should alter Inode->is_root() to something like:

    bool is_root() {
        return ino == client->get_root_ino();
    }

I don't think there are any callers of Inode->is_root currently, so I wouldn't think this would break anything. Then we could call that to see if we're at the root when doing a lookup or readdir of "..".
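The mount-relative root check proposed above can be sketched as follows. This is a minimal illustration under stated assumptions, not Ceph's implementation: the `Client` and `Inode` structs are hypothetical, heavily simplified stand-ins, the `parent` pointer and `lookup_dotdot` function are invented for the example, and the inode numbers are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; NOT Ceph's real Client/Inode classes.
struct Client {
  uint64_t root_ino;  // inode number of the directory this client mounted
  uint64_t get_root_ino() const { return root_ino; }
};

struct Inode {
  const Client* client;
  uint64_t ino;
  Inode* parent;  // assumption: nullptr only at the top of the whole tree

  // Root "from the client's perspective": the mounted directory.
  bool is_root() const { return ino == client->get_root_ino(); }
};

// Resolve ".." without ever leaving the mounted subtree.
Inode* lookup_dotdot(Inode* dir) {
  if (dir->is_root() || dir->parent == nullptr)
    return dir;  // the parent of the root is the root itself
  return dir->parent;
}

// Models /foo/bar/baz with "baz" as the mount point and checks where
// ".." lands for the mount root and for a directory beneath it.
bool demo_subtree_mount() {
  Client c{4};  // mounted at baz (ino 4 in this sketch)
  Inode top{&c, 1, nullptr};
  Inode foo{&c, 2, &top};
  Inode bar{&c, 3, &foo};
  Inode baz{&c, 4, &bar};
  Inode sub{&c, 5, &baz};  // a directory inside the mount
  return lookup_dotdot(&baz) == &baz && lookup_dotdot(&sub) == &baz;
}
```

In this model a lookup of ".." at the mount point yields "baz" rather than "bar", which matches the "no escaping from cmount" behaviour argued for above.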
diff --git a/src/client/Client.cc b/src/client/Client.cc
index 5ab0ace4d3df..287baaf20536 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -5924,7 +5924,7 @@ int Client::_lookup(Inode *dir, const string& dname, int mask,
 
   if (dname == "..") {
     if (dir->dn_set.empty())
-      r = -ENOENT;
+      *target = dir;
     else
       *target = dir->get_first_parent()->dir->parent_inode; //dirs can't be hard-linked
     goto done;
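To make the behavioural change in this hunk concrete, here is a minimal sketch that models the ".." branch before and after the patch. The `Inode` and `Dentry` structs are hypothetical simplifications (the real code reaches the parent via `get_first_parent()->dir->parent_inode`; here a dentry points straight at its parent inode), and the function names are invented for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical simplified model; NOT the real Ceph structures.
struct Inode;
struct Dentry {
  Inode* parent_inode;  // directory inode that holds this dentry
};
struct Inode {
  std::vector<Dentry*> dn_set;  // dentries that link to this inode
};

// ".." handling before the patch: an empty dn_set yields -ENOENT.
int lookup_dotdot_before(Inode* dir, Inode** target) {
  if (dir->dn_set.empty())
    return -ENOENT;
  *target = dir->dn_set.front()->parent_inode;
  return 0;
}

// ".." handling after the patch: an empty dn_set resolves to dir itself.
int lookup_dotdot_after(Inode* dir, Inode** target) {
  if (dir->dn_set.empty())
    *target = dir;
  else
    *target = dir->dn_set.front()->parent_inode;
  return 0;
}

// Exercises a root, a linked child, and an unlinked directory.
bool demo_patch_effect() {
  Inode root;    // no linking dentry: the root
  Dentry link{&root};
  Inode child;
  child.dn_set.push_back(&link);
  Inode orphan;  // unlinked directory: dn_set is empty here too
  Inode* t = nullptr;

  bool before_root = lookup_dotdot_before(&root, &t) == -ENOENT;
  bool after_root = lookup_dotdot_after(&root, &t) == 0 && t == &root;
  bool after_child = lookup_dotdot_after(&child, &t) == 0 && t == &root;
  bool after_orphan = lookup_dotdot_after(&orphan, &t) == 0 && t == &orphan;
  return before_root && after_root && after_child && after_orphan;
}
```

In this model the root and the unlinked directory are indistinguishable by `dn_set.empty()` alone, so both resolve ".." to themselves after the patch. That is the case raised in the review, and it is why an explicit root check was suggested instead.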