Message ID | 1380220810-12909-1-git-send-email-bhalevy@primarydata.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Empty header are pretty useless. Also why would you want a header outside fs/nfsd/ ? -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-09-29 14:43, Christoph Hellwig wrote: > Empty header are pretty useless. Right. This was useful early on while we re-ordered the patches that followed. No need for that now so I'll squash this patch accordingly. > Also why would you want a header > outside fs/nfsd/ ? This header contains the file system interface. Benny > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 29, 2013 at 03:12:41PM +0300, Benny Halevy wrote: > > Also why would you want a header > > outside fs/nfsd/ ? > > This header contains the file system interface. Any interface for the filesystem should be part of exportfs.h, not something nfs-specific. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-09-29 15:13, Christoph Hellwig wrote: > On Sun, Sep 29, 2013 at 03:12:41PM +0300, Benny Halevy wrote: >>> Also why would you want a header >>> outside fs/nfsd/ ? >> >> This header contains the file system interface. > > Any interface for the filesystem should be part of exportfs.h, not > something nfs-specific. Makes sense. Thanks. Bruce - are you ok with moving the pnfs interface definitions to include/linux/exportfs.h along with struct export_operations? In fact we can actually extend struct export_operations rather than adding pnfs_export_operations... Benny > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 29, 2013 at 03:20:33PM +0300, Benny Halevy wrote: > Bruce - are you ok with moving the pnfs interface definitions to > include/linux/exportfs.h along with struct export_operations? > > In fact we can actually extend struct export_operations rather > than adding pnfs_export_operations... Yes, it probably should go into the export ops, although the actual method signatures might need to be made a litle less nfs-specific for that. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 29, 2013 at 05:21:30AM -0700, Christoph Hellwig wrote: > > Bruce - are you ok with moving the pnfs interface definitions to > > include/linux/exportfs.h along with struct export_operations? > > > > In fact we can actually extend struct export_operations rather > > than adding pnfs_export_operations... > > Yes, it probably should go into the export ops, although the actual > method signatures might need to be made a litle less nfs-specific for > that. I jsut took a brief look over the diff for the whole series in the git tree and the old tree that still had block and exofs servers and have revised my opinion a little bit: - the should be a layout_type field in struct export_operations, indicating that a filesystem support some sort of pnfs-like export. - there should be a struct pnfs_operations, but it should be confined to fs/nfsd: each layout can be a separate loadable module and gets registered there. For the initial file layout that module is self-contained, but for e.g. block or objects it would have call into the filesystem through export_ops, although way lower level than the NFS XDR level, e.g. for block there would be one of to get the extent map, and one to allocate an extent. This way we alsod avoid the dependcy on nfsd in the filesystems that the cureent version introduces. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-09-29 15:35, Christoph Hellwig wrote: > On Sun, Sep 29, 2013 at 05:21:30AM -0700, Christoph Hellwig wrote: >>> Bruce - are you ok with moving the pnfs interface definitions to >>> include/linux/exportfs.h along with struct export_operations? >>> >>> In fact we can actually extend struct export_operations rather >>> than adding pnfs_export_operations... >> >> Yes, it probably should go into the export ops, although the actual >> method signatures might need to be made a litle less nfs-specific for >> that. > > I jsut took a brief look over the diff for the whole series in the git > tree and the old tree that still had block and exofs servers and have > revised my opinion a little bit: > > > - the should be a layout_type field in struct export_operations, > indicating that a filesystem support some sort of pnfs-like export. > - there should be a struct pnfs_operations, but it should be confined > to fs/nfsd: each layout can be a separate loadable module and gets > registered there. For the initial file layout that module is > self-contained, but for e.g. block or objects it would have > call into the filesystem through export_ops, although way lower level This makes sense for blocks for its use of the generic block allocation and mapping calls (and it needs a new call for committing uninitialized extents) But for objects there are no such calls and the integration with exofs is pretty intimate. Benny > than the NFS XDR level, e.g. for block there would be one of to get > the extent map, and one to allocate an extent. > > This way we alsod avoid the dependcy on nfsd in the filesystems that the > cureent version introduces. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/29/2013 05:35 AM, Christoph Hellwig wrote:> On Sun, Sep 29, 2013 at 05:21:30AM -0700, Christoph Hellwig wrote: >>> Bruce - are you ok with moving the pnfs interface definitions to >>> include/linux/exportfs.h along with struct export_operations? >>> >>> In fact we can actually extend struct export_operations rather >>> than adding pnfs_export_operations... >> >> Yes, it probably should go into the export ops, although the actual >> method signatures might need to be made a litle less nfs-specific for >> that. > > I jsut took a brief look over the diff for the whole series in the git > tree and the old tree that still had block and exofs servers and have > revised my opinion a little bit: > > > - the should be a layout_type field in struct export_operations, > indicating that a filesystem support some sort of pnfs-like export. The pnfs protocol and people have plans to, allow a multi typed layouts from the same super-block. It is a per file attribute. It even allows a multi protocol access to the same file. The only flag should be the presence of the layout_get vector that should indicate support or lack of it. (In fact I would remove layout_type field completly it's place is only as an input to LO_GET and DEVICE_INFO) > - there should be a struct pnfs_operations, but it should be confined > to fs/nfsd: each layout can be a separate loadable module and gets > registered there. For the initial file layout that module is > self-contained, but for e.g. block or objects it would have > call into the filesystem through export_ops, although way lower level > than the NFS XDR level, e.g. for block there would be one of to get > the extent map, and one to allocate an extent. > No! This does not make any sense. What you say does not fit any model of any cluster filesystem today. - Again the FS can support any protocol. - Only the FS understand the structure and layout of the file access. Any other model is a specific implementation and breaks abstraction. The only true abstraction is the LO_GET LO_RETURN LO_COMMIT DEVICE_INFO and LO_CB_RECALL. anything else is making assumptions. There is a pnfs vector and it is at this abstraction level exactly. > This way we alsod avoid the dependcy on nfsd in the filesystems that the > cureent version introduces. There is no "dependency on nfsd in the filesystems" The only dependency the FS has is an import of some library routines at exportfs that take an abstract layout and device descriptions and encode them into an XDR buffer. But the FS knows nothing of the XDR and the NFSD is free to unload at any moment without forcing the FS to unload first or at all. This is actually tested, in fact I do this all the time when I want to start fresh and have NFSD close all resources on the FS. Nothing changed, the FS is independent and NFSD is dependent on the FS, but in an abstract way via an exports vector. Where did you see such dependency? > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Cheers Boaz -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/29/2013 05:20 AM, Benny Halevy wrote: > Makes sense. Thanks. > > Bruce - are you ok with moving the pnfs interface definitions to > include/linux/exportfs.h along with struct export_operations? > I disagree this is a bloat. It is big enough as it is. For code clarity and maintenance we should split. Yes they are related but then lots of stuff are related but we want to keep them separate compact and readable. why put everything in the same file, it does not make any sense. You have a library exportfs that exports a few related but different interfaces, In fact this header is not exported from exportfs it is the type system that defines the pnfs operations. They are so big and verbose they call for a separate header. > In fact we can actually extend struct export_operations rather > than adding pnfs_export_operations... > This is a great waist of space. Any FS that does not support pnfs, Which is currently all but exofs, will have 7 NULLs instead of one. And putting a struct pnfs_export_operations pointer inside struct export_operations gives you nothing but an extra dereference and funny looking code. Current system is just the most simple and most efficient. Why the extra complexity? > Benny Cheers Boaz -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 30, 2013 at 06:23:42PM +0300, Benny Halevy wrote: > This makes sense for blocks for its use of the generic block allocation and mapping > calls (and it needs a new call for committing uninitialized extents) > But for objects there are no such calls and the integration with exofs > is pretty intimate. That's just because there is no proper split between exofs and the pnfsd-objects layer. The split between the two doesn't seem too hard and would dramatically improve the interface. With that and moving the recall handling on truncate to generic code where it belongs almost nothin pnfs-specific will be left in exofs. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 30, 2013 at 06:05:18PM -0700, Boaz Harrosh wrote: > The pnfs protocol and people have plans to, allow a multi typed > layouts from the same super-block. It is a per file attribute. > It even allows a multi protocol access to the same file. > The only flag should be the presence of the layout_get vector > that should indicate support or lack of it. The current method doesn't help with that as it can return a single type only anyway. So in principle I agree with you, but the way to fix it is not to keep the method, but to make sure it returns a bitmap of supported layouts. > > - there should be a struct pnfs_operations, but it should be confined > > to fs/nfsd: each layout can be a separate loadable module and gets > > registered there. For the initial file layout that module is > > self-contained, but for e.g. block or objects it would have > > call into the filesystem through export_ops, although way lower level > > than the NFS XDR level, e.g. for block there would be one of to get > > the extent map, and one to allocate an extent. > > > > No! This does not make any sense. What you say does not fit any model of any > cluster filesystem today. > > - Again the FS can support any protocol. > - Only the FS understand the structure and layout of the file access. Any > other model is a specific implementation and breaks abstraction. The only true > abstraction is the LO_GET LO_RETURN LO_COMMIT DEVICE_INFO and LO_CB_RECALL. anything > else is making assumptions. > > There is a pnfs vector and it is at this abstraction level exactly. No, the problem is that the pnfs_export_operations are entirely at the wrong level, as I tried to explain. The right level is very different for the different layouts: - for files it needs to boil down to a: - get a list of devices - given an inode/offset return the layout - for block it's get a block map for a file / create an unwritten extent / convert it to written - for object it seems (not too familar): - get a list of devices for this fs - given an inode/offset return the layout - tell the fs that I/O has finished As all the layouts operate on different data structures it makes sense to make the methods operate on those, and keep the boilerplate code including the XDR encoding/decoding in one single place. Now how these pnfsd object layout drivers communicate with the fs I don't have an opinion on until we see the actual code, maybe we need a pnfs_<layout>_ops if it's complicated enough. For the files case that can just call into dlm directly currently as we have no other interesting cluster fs in tree it's a mood point. For block it's simple enough that I'd just add it to export_ops, if not by that time we redo the current get_blocks mess in a way that we can simply piggyback it on that which would be even easier. > > This way we alsod avoid the dependcy on nfsd in the filesystems that the > > cureent version introduces. > > There is no "dependency on nfsd in the filesystems" The patchset as pulled will created a depency on nfsd.ko from gfs2.ko > The only dependency the FS has is an import of some library routines > at exportfs that take an abstract layout and device descriptions and encode > them into an XDR buffer. But the FS knows nothing of the XDR and the > NFSD is free to unload at any moment without forcing the FS to unload > first or at all. > This is actually tested, in fact I do this all the time when I want to > start fresh and have NFSD close all resources on the FS. That's not what happens with the file layout as posted, and it's not something I want to see happen every, btw. In Linux we're all about proper abstractions, and letting a fs control all pnfs aspects directly instead of having common code is a receipe for tons of copy & pasted code full of different bugs if we ever get additional implementations of a layout. Not that I really expect any in tree as all the other "interested partied" have shown to be leechers that just want to keep their filesystems out of tree and most of the time as illegal propritary modules anyway. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 29, 2013 at 03:20:33PM +0300, Benny Halevy wrote: > On 2013-09-29 15:13, Christoph Hellwig wrote: > > On Sun, Sep 29, 2013 at 03:12:41PM +0300, Benny Halevy wrote: > >>> Also why would you want a header > >>> outside fs/nfsd/ ? > >> > >> This header contains the file system interface. > > > > Any interface for the filesystem should be part of exportfs.h, not > > something nfs-specific. > > Makes sense. Thanks. > > Bruce - are you ok with moving the pnfs interface definitions to > include/linux/exportfs.h along with struct export_operations? Fine by me.--b. > > In fact we can actually extend struct export_operations rather > than adding pnfs_export_operations... > > Benny > > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 29, 2013 at 05:35:53AM -0700, Christoph Hellwig wrote: > On Sun, Sep 29, 2013 at 05:21:30AM -0700, Christoph Hellwig wrote: > > > Bruce - are you ok with moving the pnfs interface definitions to > > > include/linux/exportfs.h along with struct export_operations? > > > > > > In fact we can actually extend struct export_operations rather > > > than adding pnfs_export_operations... > > > > Yes, it probably should go into the export ops, although the actual > > method signatures might need to be made a litle less nfs-specific for > > that. > > I jsut took a brief look over the diff for the whole series in the git > tree and the old tree that still had block and exofs servers and have > revised my opinion a little bit: > > > - the should be a layout_type field in struct export_operations, > indicating that a filesystem support some sort of pnfs-like export. > - there should be a struct pnfs_operations, but it should be confined > to fs/nfsd: each layout can be a separate loadable module and gets > registered there. For the initial file layout that module is > self-contained, but for e.g. block or objects it would have > call into the filesystem through export_ops, although way lower level > than the NFS XDR level, e.g. for block there would be one of to get > the extent map, and one to allocate an extent. That sounds OK to me. My (possibly faulty) memory of how the xdr-encoding-in-the-fs came about: I think people weren't willing to commit to any reasonable upper limit on the size of the layout. So they originally considered something more like readdir that would loop over extents and pass them to a callback for encoding. But xdr isn't complicated so you could instead give the filesystem a simple library of xdr encoders to call. But it sounds like that's not exactly what got implemented. And maybe nobody actually has such big layouts. > This way we alsod avoid the dependcy on nfsd in the filesystems that the > cureent version introduces. Ugh--thanks for catching that. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-10-01 16:33, Christoph Hellwig wrote: > On Mon, Sep 30, 2013 at 06:05:18PM -0700, Boaz Harrosh wrote: >> The pnfs protocol and people have plans to, allow a multi typed >> layouts from the same super-block. It is a per file attribute. >> It even allows a multi protocol access to the same file. >> The only flag should be the presence of the layout_get vector >> that should indicate support or lack of it. > > The current method doesn't help with that as it can return a single type > only anyway. So in principle I agree with you, but the way to fix it is > not to keep the method, but to make sure it returns a bitmap of > supported layouts. > >>> - there should be a struct pnfs_operations, but it should be confined >>> to fs/nfsd: each layout can be a separate loadable module and gets >>> registered there. For the initial file layout that module is >>> self-contained, but for e.g. block or objects it would have >>> call into the filesystem through export_ops, although way lower level >>> than the NFS XDR level, e.g. for block there would be one of to get >>> the extent map, and one to allocate an extent. >>> >> >> No! This does not make any sense. What you say does not fit any model of any >> cluster filesystem today. >> >> - Again the FS can support any protocol. >> - Only the FS understand the structure and layout of the file access. Any >> other model is a specific implementation and breaks abstraction. The only true >> abstraction is the LO_GET LO_RETURN LO_COMMIT DEVICE_INFO and LO_CB_RECALL. anything >> else is making assumptions. >> >> There is a pnfs vector and it is at this abstraction level exactly. > > No, the problem is that the pnfs_export_operations are entirely at the > wrong level, as I tried to explain. The right level is very different > for the different layouts: > > - for files it needs to boil down to a: > > - get a list of devices > - given an inode/offset return the layout > For loosely clustered files-layout file systems the MDS would also need to handle LAYOUTCOMMIT, and also generate and handle layout recalls, depending on how much of that we can do in a generic way and how much is file system specific. > - for block it's get a block map for a file / create an unwritten > extent / convert it to written Besides converting to written there is also de-allocation of provisionally allocated extents and more advanced stuff for client assisted copy-on-write where after allocation of uninitialized extents the client can commit extents to the block map by replacing existing initialized extents with new ones. > > - for object it seems (not too familar): > > - get a list of devices for this fs > - given an inode/offset return the layout > - tell the fs that I/O has finished > > As all the layouts operate on different data structures it makes sense > to make the methods operate on those, and keep the boilerplate code > including the XDR encoding/decoding in one single place. > > Now how these pnfsd object layout drivers communicate with the fs I > don't have an opinion on until we see the actual code, maybe we need a > pnfs_<layout>_ops if it's complicated enough. The original design this patchset builds on called for implementing the layout type specifics as library code and let file systems supporting pnfs implement the high level pnfs methods and use the library for encoding and decoding layout-type specific XDR and use other helpers if needed. You're calling for a "back-end layout driver" kind of layer that will serve the layout type and will call into the file system using lower level methods. > For the files case that > can just call into dlm directly currently as we have no other > interesting cluster fs in tree it's a mood point. For block it's simple > enough that I'd just add it to export_ops, if not by that time we redo > the current get_blocks mess in a way that we can simply piggyback it on > that which would be even easier. > On the same lines, I think that for now calling directly into exofs makes the most sense as we have no other object based pnfs implementation. >>> This way we alsod avoid the dependcy on nfsd in the filesystems that the >>> cureent version introduces. Yeah, I agree that this dependency can and should be taken care of. >> >> There is no "dependency on nfsd in the filesystems" > > The patchset as pulled will created a depency on nfsd.ko from gfs2.ko > yup, fs/nfsd/nfs4pnfsdlm.c exports pnfs_dlm_export_ops and implements the respective methods. This does not really belong to nfsd but rather to the fs/dlm module. Benny >> The only dependency the FS has is an import of some library routines >> at exportfs that take an abstract layout and device descriptions and encode >> them into an XDR buffer. But the FS knows nothing of the XDR and the >> NFSD is free to unload at any moment without forcing the FS to unload >> first or at all. >> This is actually tested, in fact I do this all the time when I want to >> start fresh and have NFSD close all resources on the FS. > > That's not what happens with the file layout as posted, and it's not > something I want to see happen every, btw. In Linux we're all about > proper abstractions, and letting a fs control all pnfs aspects directly > instead of having common code is a receipe for tons of copy & pasted > code full of different bugs if we ever get additional implementations of > a layout. Not that I really expect any in tree as all the other > "interested partied" have shown to be leechers that just want to keep > their filesystems out of tree and most of the time as illegal propritary > modules anyway. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-10-01 23:30, J. Bruce Fields wrote: > On Sun, Sep 29, 2013 at 05:35:53AM -0700, Christoph Hellwig wrote: >> On Sun, Sep 29, 2013 at 05:21:30AM -0700, Christoph Hellwig wrote: >>>> Bruce - are you ok with moving the pnfs interface definitions to >>>> include/linux/exportfs.h along with struct export_operations? >>>> >>>> In fact we can actually extend struct export_operations rather >>>> than adding pnfs_export_operations... >>> >>> Yes, it probably should go into the export ops, although the actual >>> method signatures might need to be made a litle less nfs-specific for >>> that. >> >> I jsut took a brief look over the diff for the whole series in the git >> tree and the old tree that still had block and exofs servers and have >> revised my opinion a little bit: >> >> >> - the should be a layout_type field in struct export_operations, >> indicating that a filesystem support some sort of pnfs-like export. >> - there should be a struct pnfs_operations, but it should be confined >> to fs/nfsd: each layout can be a separate loadable module and gets >> registered there. For the initial file layout that module is >> self-contained, but for e.g. block or objects it would have >> call into the filesystem through export_ops, although way lower level >> than the NFS XDR level, e.g. for block there would be one of to get >> the extent map, and one to allocate an extent. > > That sounds OK to me. > > My (possibly faulty) memory of how the xdr-encoding-in-the-fs came > about: I think people weren't willing to commit to any reasonable upper > limit on the size of the layout. So they originally considered > something more like readdir that would loop over extents and pass them > to a callback for encoding. But xdr isn't complicated so you could > instead give the filesystem a simple library of xdr encoders to call. > > But it sounds like that's not exactly what got implemented. And maybe > nobody actually has such big layouts. > >> This way we alsod avoid the dependcy on nfsd in the filesystems that the >> cureent version introduces. > > Ugh--thanks for catching that. I suggest moving the code in fs/nfsd/nfs4pnfsdlm.c to fs/dlm. Benny > > --b. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 02, 2013 at 02:35:42PM +0300, Benny Halevy wrote: > For loosely clustered files-layout file systems the MDS would also > need to handle LAYOUTCOMMIT, and also generate and handle layout recalls, > depending on how much of that we can do in a generic way and how much is > file system specific. Which is something we can deal with in great detail once such a filesystem and the pnfs support for it land in the kernel tree. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 02, 2013 at 02:36:54PM +0300, Benny Halevy wrote: > I suggest moving the code in fs/nfsd/nfs4pnfsdlm.c to > fs/dlm. I don't reall care where in the tree it lives, but it needs to be neither part of nfsd or the filesystem (respectively dlm in this case). -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-10-02 19:07, Christoph Hellwig wrote: > On Wed, Oct 02, 2013 at 02:36:54PM +0300, Benny Halevy wrote: >> I suggest moving the code in fs/nfsd/nfs4pnfsdlm.c to >> fs/dlm. > > I don't reall care where in the tree it lives, but it needs to be > neither part of nfsd or the filesystem (respectively dlm in this case). Just that this is dlm specific logic. For example, using dlm_ino_hash() in nfsd4_pnfs_dlm_layoutget(). Or even knowing that layout->lg_stripe_type = STRIPE_SPARSE; assumes knowledge of the underlying cluster fs implementation. Benny > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 03, 2013 at 09:02:27AM +0300, Benny Halevy wrote: > Just that this is dlm specific logic. > For example, using dlm_ino_hash() in nfsd4_pnfs_dlm_layoutget(). > Or even knowing that > layout->lg_stripe_type = STRIPE_SPARSE; > assumes knowledge of the underlying cluster fs implementation. Which in-tree or soon in-tree filesystem do you care about? And why don't we see pnfs support for it submitted instead of the fairly useless gfs2 support? -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-10-03 12:55, Christoph Hellwig wrote: > On Thu, Oct 03, 2013 at 09:02:27AM +0300, Benny Halevy wrote: >> Just that this is dlm specific logic. >> For example, using dlm_ino_hash() in nfsd4_pnfs_dlm_layoutget(). >> Or even knowing that >> layout->lg_stripe_type = STRIPE_SPARSE; >> assumes knowledge of the underlying cluster fs implementation. > > Which in-tree or soon in-tree filesystem do you care about? And why > don't we see pnfs support for it submitted instead of the fairly useless > gfs2 support? I picked gfs2 as the initial use case for simplicity and ease of review. If there is a rough consensus that it's useless and not worthy of inclusion then the one we care about the most is exofs that has a more complete pnfs implementation. Benny > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 03, 2013 at 03:29:06PM +0300, Benny Halevy wrote: > I picked gfs2 as the initial use case for simplicity and ease of review. > If there is a rough consensus that it's useless and not worthy of inclusion > then the one we care about the most is exofs that has a more complete pnfs > implementation. This was in reference to file layout implementation details, so exofs isn't a contender there. As far as exofs is concerned a pnfs implementation based on it has just as much toy status as the current gfs2 one. While the pnfs side of it might as well be a lot better, a filesystem that lacks all the integrity and scalability features developed in the last 30 years can't be considered more than a proof of concept. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/03/2013 08:29 AM, Benny Halevy wrote: > On 2013-10-03 12:55, Christoph Hellwig wrote: >> >On Thu, Oct 03, 2013 at 09:02:27AM +0300, Benny Halevy wrote: >>> >>Just that this is dlm specific logic. >>> >>For example, using dlm_ino_hash() in nfsd4_pnfs_dlm_layoutget(). >>> >>Or even knowing that >>> >> layout->lg_stripe_type = STRIPE_SPARSE; >>> >>assumes knowledge of the underlying cluster fs implementation. >> > >> >Which in-tree or soon in-tree filesystem do you care about? And why >> >don't we see pnfs support for it submitted instead of the fairly useless >> >gfs2 support? > I picked gfs2 as the initial use case for simplicity and ease of review. > If there is a rough consensus that it's useless and not worthy of inclusion > then the one we care about the most is exofs that has a more complete pnfs > implementation. > > Benny > I don't see having GFS2 supported as a base for pNFS as useless. Christoph, is this a concern about GFS2 being too complicated for normal deployment or a lack in the pNFS support on top of it? thanks! Ric -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 03, 2013 at 09:12:24AM -0400, Ric Wheeler wrote: > >>>Which in-tree or soon in-tree filesystem do you care about? And why > >>>don't we see pnfs support for it submitted instead of the fairly useless > >>>gfs2 support? > >I picked gfs2 as the initial use case for simplicity and ease of review. > >If there is a rough consensus that it's useless and not worthy of inclusion > >then the one we care about the most is exofs that has a more complete pnfs > >implementation. > > > >Benny > > > > I don't see having GFS2 supported as a base for pNFS as useless. > Christoph, is this a concern about GFS2 being too complicated for > normal deployment or a lack in the pNFS support on top of it? Fairly useless was specific to the particular implementation: - which in the stipped down version here only supports DS access for reads - which in the previous version showed worse performance than always going through the MDS I don't have a problem with using GFS2 by itself, but any implementation proposed should actually show signifiant real life benefits before it gets merged. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/03/2013 09:17 AM, Christoph Hellwig wrote: > On Thu, Oct 03, 2013 at 09:12:24AM -0400, Ric Wheeler wrote: >>>>> Which in-tree or soon in-tree filesystem do you care about? And why >>>>> don't we see pnfs support for it submitted instead of the fairly useless >>>>> gfs2 support? >>> I picked gfs2 as the initial use case for simplicity and ease of review. >>> If there is a rough consensus that it's useless and not worthy of inclusion >>> then the one we care about the most is exofs that has a more complete pnfs >>> implementation. >>> >>> Benny >>> >> I don't see having GFS2 supported as a base for pNFS as useless. >> Christoph, is this a concern about GFS2 being too complicated for >> normal deployment or a lack in the pNFS support on top of it? > Fairly useless was specific to the particular implementation: > > - which in the stipped down version here only supports DS access for > reads > - which in the previous version showed worse performance than always > going through the MDS > > I don't have a problem with using GFS2 by itself, but any implementation > proposed should actually show signifiant real life benefits before it > gets merged. > Makes sense, thanks! Ric -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-10-03 16:18, Ric Wheeler wrote: > On 10/03/2013 09:17 AM, Christoph Hellwig wrote: >> On Thu, Oct 03, 2013 at 09:12:24AM -0400, Ric Wheeler wrote: >>>>>> Which in-tree or soon in-tree filesystem do you care about? And why >>>>>> don't we see pnfs support for it submitted instead of the fairly useless >>>>>> gfs2 support? >>>> I picked gfs2 as the initial use case for simplicity and ease of review. >>>> If there is a rough consensus that it's useless and not worthy of inclusion >>>> then the one we care about the most is exofs that has a more complete pnfs >>>> implementation. >>>> >>>> Benny >>>> >>> I don't see having GFS2 supported as a base for pNFS as useless. >>> Christoph, is this a concern about GFS2 being too complicated for >>> normal deployment or a lack in the pNFS support on top of it? >> Fairly useless was specific to the particular implementation: >> >> - which in the stipped down version here only supports DS access for >> reads >> - which in the previous version showed worse performance than always >> going through the MDS >> >> I don't have a problem with using GFS2 by itself, but any implementation >> proposed should actually show signifiant real life benefits before it >> gets merged. >> The question is what is the minimum value for submitting upstream... The thing pnfs over dlm/gfs2 is missing mostly is supporting read/write layout. One could use them load balancing, e.g. by either redirecting to a node holding an exclusive lock on the file, if there is one, or dlm_ino_hash in its absence. Benny > > Makes sense, thanks! > > Ric > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 03, 2013 at 05:19:31PM +0300, Benny Halevy wrote: > The question is what is the minimum value for submitting upstream... > > The thing pnfs over dlm/gfs2 is missing mostly is supporting read/write layout. > One could use them load balancing, e.g. by either redirecting to a node > holding an exclusive lock on the file, if there is one, or dlm_ino_hash in its absence. One could do a lot of things, especially given infinite time. But what does the current code buy us? We need data on that to justify merging such a large chunk of code. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/03/2013 10:21 AM, Christoph Hellwig wrote: > On Thu, Oct 03, 2013 at 05:19:31PM +0300, Benny Halevy wrote: >> The question is what is the minimum value for submitting upstream... >> >> The thing pnfs over dlm/gfs2 is missing mostly is supporting read/write layout. >> One could use them load balancing, e.g. by either redirecting to a node >> holding an exclusive lock on the file, if there is one, or dlm_ino_hash in its absence. > One could do a lot of things, especially given infinite time. But what > does the current code buy us? We need data on that to justify merging > such a large chunk of code. > I think that we can help figure out how to get something reasonably useful ready for upstream, just need to get some people spun up to help test out some of these ideas... ric -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2013-10-03 17:24, Ric Wheeler wrote: > On 10/03/2013 10:21 AM, Christoph Hellwig wrote: >> On Thu, Oct 03, 2013 at 05:19:31PM +0300, Benny Halevy wrote: >>> The question is what is the minimum value for submitting upstream... >>> >>> The thing pnfs over dlm/gfs2 is missing mostly is supporting read/write layout. >>> One could use them load balancing, e.g. by either redirecting to a node >>> holding an exclusive lock on the file, if there is one, or dlm_ino_hash in its absence. >> One could do a lot of things, especially given infinite time. But what >> does the current code buy us? We need data on that to justify merging >> such a large chunk of code. >> > > I think that we can help figure out how to get something reasonably useful ready > for upstream, just need to get some people spun up to help test out some of > these ideas... That would be very helpful. Much appreciated! Benny > > ric > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfsd/pnfsd.h b/fs/nfsd/pnfsd.h new file mode 100644 index 0000000..65fb57e --- /dev/null +++ b/fs/nfsd/pnfsd.h @@ -0,0 +1,37 @@ +/* + * Copyright (c) 2005 The Regents of the University of Michigan. + * All rights reserved. + * + * Andy Adamson <andros@umich.edu> + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + */ + +#ifndef LINUX_NFSD_PNFSD_H +#define LINUX_NFSD_PNFSD_H + +#endif /* LINUX_NFSD_PNFSD_H */ diff --git a/include/linux/nfsd/nfsd4_pnfs.h b/include/linux/nfsd/nfsd4_pnfs.h new file mode 100644 index 0000000..9e7d95e --- /dev/null +++ b/include/linux/nfsd/nfsd4_pnfs.h @@ -0,0 +1,37 @@ +/* + * Copyright (c) 2006 The Regents of the University of Michigan. + * All rights reserved. + * + * Andy Adamson <andros@umich.edu> + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + */ + +#ifndef _LINUX_NFSD_NFSD4_PNFS_H +#define _LINUX_NFSD_NFSD4_PNFS_H + +#endif /* _LINUX_NFSD_NFSD4_PNFS_H */