Message ID: 1851200.1596472222@warthog.procyon.org.uk (mailing list archive)
State: New, archived
Series: [GIT,PULL] fscache rewrite
Hi Linus,

Can you drop the fscache rewrite pull for now? We've seen an issue in NFS
integration and need to rework the read helper a bit. I made an assumption
that fscache will always be able to request that the netfs perform a read
of a certain minimum size - but with NFS you can break that by setting
rsize too small.

We need to make the read helper able to make multiple netfs reads. This
can help ceph too.

Thanks,
David
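[For illustration, a minimal userspace sketch of the loop being described:
the cache wants a whole granule, but the transport caps each read at rsize,
so the helper must iterate. All names and sizes here are hypothetical
stand-ins, not fscache or netfs API.]

#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define GRANULE_SIZE (256 * 1024)   /* one cache granule */

/* Stand-in for a single netfs read capped at rsize; returns bytes read. */
static size_t netfs_read_once(char *buf, size_t off, size_t want,
                              size_t rsize)
{
        size_t n = want < rsize ? want : rsize;

        memset(buf + off, 0, n);    /* pretend we fetched data */
        return n;
}

/* Fill one granule, issuing as many reads as the transport requires. */
static size_t read_granule(char *buf, size_t rsize)
{
        size_t done = 0;

        while (done < GRANULE_SIZE) {
                size_t n = netfs_read_once(buf, done,
                                           GRANULE_SIZE - done, rsize);
                if (n == 0)
                        break;      /* short read: EOF or error */
                done += n;
        }
        return done;
}

int main(void)
{
        static char buf[GRANULE_SIZE];

        /* With rsize=1K the helper needs 256 reads to fill one granule. */
        printf("read %zu bytes\n", read_granule(buf, 1024));
        return 0;
}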
cifs.ko also can set rsize quite small (even 1K, for example, although that
will be more than 10x slower than the default 4MB, so hopefully no one is
crazy enough to do that). I can't imagine an SMB3 server negotiating an
rsize or wsize smaller than 64K in today's world (typical is 1MB to 8MB),
but the user can specify a much smaller rsize on mount. If 64K is an
adequate minimum, we could change the cifs mount option parsing to require
a certain minimum rsize if fscache is selected.

On Mon, Aug 10, 2020 at 10:17 AM David Howells <dhowells@redhat.com> wrote:
> I made an assumption that fscache will always be able to request that the
> netfs perform a read of a certain minimum size - but with NFS you can
> break that by setting rsize too small.
Steve French <smfrench@gmail.com> wrote:

> cifs.ko also can set rsize quite small (even 1K, for example, although
> that will be more than 10x slower than the default 4MB, so hopefully no
> one is crazy enough to do that).

You can set rsize < PAGE_SIZE?

> I can't imagine an SMB3 server negotiating an rsize or wsize smaller than
> 64K in today's world (typical is 1MB to 8MB), but the user can specify a
> much smaller rsize on mount. If 64K is an adequate minimum, we could
> change the cifs mount option parsing to require a certain minimum rsize
> if fscache is selected.

I've borrowed the 256K granule size used by various AFS implementations for
the moment. A 512-byte xattr can thus hold a bitmap covering 1G of file
space.

David
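[For reference, the arithmetic behind those figures: 512 bytes of xattr is
4096 bits, and at one bit per 256K granule that covers exactly 1 GiB.]

#include <stdio.h>

int main(void)
{
        unsigned long granule = 256UL * 1024;   /* 256K granule */
        unsigned long bits = 512UL * 8;         /* 512-byte xattr = 4096 bits */
        unsigned long coverage = bits * granule;

        /* Prints: 4096 bits x 256K = 1024 MiB */
        printf("%lu bits x 256K = %lu MiB\n", bits, coverage >> 20);
        return 0;
}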
On Mon, Aug 10, 2020 at 10:48 AM David Howells <dhowells@redhat.com> wrote:
> Steve French <smfrench@gmail.com> wrote:
> > cifs.ko also can set rsize quite small (even 1K, for example, although
> > that will be more than 10x slower than the default 4MB, so hopefully no
> > one is crazy enough to do that).
>
> You can set rsize < PAGE_SIZE?

I have never seen anyone do it, and it would be crazy to set it so small
(it would hurt performance a lot and cause extra work on client and
server), but yes, it can be set very small. Apparently NFS can set rsize
to 1K as well (see https://linux.die.net/man/5/nfs).

I don't mind adding a minimum rsize check to cifs.ko (preventing a user
from setting rsize below the page size, for example) if there is a
precedent for this in other filesystems, or a bug that a tiny rsize would
cause.

In general, my informal perf measurements showed slight advantages with
all servers for larger rsizes up to 4MB (thus the cifs client will
negotiate 4MB by default even if the server supports larger), but having
the user set rsize to 8MB on mount could help perf to some servers. I am
hoping we can figure out a way to automatically determine when to
negotiate rsize larger than 4MB, but in the meantime rsize will almost
always be 4MB (or 1MB on mounts to some older servers) for cifs, though
some users will benefit slightly from manually setting it to 8MB.
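[A hedged sketch of what such a minimum-rsize check could look like; the
page-size constant and the behaviour are assumptions for illustration, not
actual cifs.ko mount-parsing code.]

#include <stdbool.h>
#include <stdio.h>

#define ASSUMED_PAGE_SIZE 4096UL    /* typical page size on x86-64 */

/* Reject an fscache-enabled mount whose rsize is below the page size. */
static bool rsize_ok_for_fscache(unsigned long rsize, bool fsc)
{
        if (!fsc)
                return true;        /* no cache: any negotiated rsize is fine */
        return rsize >= ASSUMED_PAGE_SIZE;
}

int main(void)
{
        printf("%d\n", rsize_ok_for_fscache(1024, true));   /* 0: rejected */
        printf("%d\n", rsize_ok_for_fscache(65536, true));  /* 1: accepted */
        return 0;
}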
On Mon, Aug 10, 2020 at 11:48 AM David Howells <dhowells@redhat.com> wrote:
> I've borrowed the 256K granule size used by various AFS implementations
> for the moment. A 512-byte xattr can thus hold a bitmap covering 1G of
> file space.

Is it possible to make the granule size configurable, then reject a
registration if the size is too small or not a power of 2? Then a netfs
using the API could try to set it equal to rsize, and error out with a
message if the registration was rejected.
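[A sketch of the registration-time validation being proposed: reject a
granule size that is below some floor or not a power of two. The function
name and the 64K floor are assumptions, borrowed from the 64K figure
mentioned earlier in the thread.]

#include <stdbool.h>
#include <stdio.h>

#define MIN_GRANULE (64UL * 1024)   /* assumed floor, per the 64K suggestion */

static bool granule_valid(unsigned long g)
{
        if (g < MIN_GRANULE)
                return false;
        return (g & (g - 1)) == 0;  /* classic power-of-two test */
}

int main(void)
{
        printf("%d\n", granule_valid(256UL * 1024));  /* 1: 256K is fine */
        printf("%d\n", granule_valid(1000UL * 1024)); /* 0: not a power of 2 */
        return 0;
}

A netfs could then pass its rsize to such a check at registration and fail
the mount with a message if it is rejected.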
On Mon, Aug 10, 2020 at 04:16:59PM +0100, David Howells wrote:
> We need to make the read helper able to make multiple netfs reads. This
> can help ceph too.

FYI, a giant rewrite dropping support for existing consumers is always
rather awkward. Is there any way you could pre-stage some infrastructure
changes, and then do a temporary fscache2, which could then be renamed
back to fscache once everyone has switched over?
On Mon, 2020-08-10 at 12:35 -0400, David Wysochanski wrote:
> Is it possible to make the granule size configurable, then reject a
> registration if the size is too small or not a power of 2? Then a netfs
> using the API could try to set it equal to rsize, and error out with a
> message if the registration was rejected.

...or maybe we should just make fscache incompatible with an rsize that
isn't an even multiple of 256k? You need to set mount options for both,
typically, so it would be fairly trivial to check this at mount time, I'd
think.
On 8/10/20 12:06 PM, Jeff Layton wrote:
> ...or maybe we should just make fscache incompatible with an rsize that
> isn't an even multiple of 256k? You need to set mount options for both,
> typically, so it would be fairly trivial to check this at mount time,
> I'd think.

Yes - if fscache is specified on mount, it would be easy to round rsize up
(or down), at least for cifs.ko (perhaps simply in the mount.cifs helper,
so a warning could be returned to the user), to whatever boundary you
prefer in fscache. The default of 4MB (or 1MB for mounts to some older
servers) should be fine. Similarly, if the user requested the default but
the server negotiated an unusual size that is not a multiple of 256K, we
could try to round it down (or fail the mount if rounding down to a 256K
multiple is not possible).
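[A sketch of the mount-time rounding described above, assuming a 256K
fscache granule; purely illustrative, not mount.cifs code.]

#include <stdio.h>

#define GRANULE (256UL * 1024)

/* Round a negotiated rsize down to a 256K boundary (256K is a power of 2,
 * so masking works); a result of 0 would mean "fail the mount". */
static unsigned long round_rsize_down(unsigned long rsize)
{
        return rsize & ~(GRANULE - 1);
}

int main(void)
{
        printf("%luK\n", round_rsize_down(1024UL * 1024) >> 10); /* 1024K */
        printf("%luK\n", round_rsize_down(1000UL * 1024) >> 10); /* 768K */
        printf("%luK\n", round_rsize_down(100UL * 1024) >> 10);  /* 0K: fail */
        return 0;
}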
Christoph Hellwig <hch@lst.de> wrote:

> FYI, a giant rewrite dropping support for existing consumers is always
> rather awkward. Is there any way you could pre-stage some infrastructure
> changes, and then do a temporary fscache2, which could then be renamed
> back to fscache once everyone has switched over?

That's a bit tricky. There are three points that would have to be shared:
the userspace miscdev interface, the backing filesystem and the single
index tree.

It's probably easier to just have a go at converting 9P and cifs. Making
the old and new APIs share would be a fairly hefty undertaking in its own
right.

David
David Howells wrote on Thu, Aug 27, 2020:
> That's a bit tricky. There are three points that would have to be shared:
> the userspace miscdev interface, the backing filesystem and the single
> index tree.
>
> It's probably easier to just have a go at converting 9P and cifs. Making
> the old and new APIs share would be a fairly hefty undertaking in its
> own right.

While I agree something incremental is probably better, I have some free
time over the next few weeks, so I will take a shot at 9p; it's definitely
going to be easier.

Should I submit patches to you, or wait until Linus merges it next cycle
and send them directly?

I see Jeff's ceph patches are still in his tree's ceph-fscache-iter branch,
and I don't see them anywhere in your tree.
Dominique Martinet <asmadeus@codewreck.org> wrote:

> Should I submit patches to you, or wait until Linus merges it next cycle
> and send them directly?
>
> I see Jeff's ceph patches are still in his tree's ceph-fscache-iter
> branch, and I don't see them anywhere in your tree.

I really want them all to go in the same window, but there may be a
requirement for some filesystem-specific sets (eg. NFS) to go via the
maintainer tree.

Btw, at the moment, I'm looking at making the fscache read helper support
the new ->readahead() op.

David
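[For context, the ->readahead() address_space op (added in Linux 5.8) has
the shape below. The read-helper call is a hypothetical placeholder - the
actual fscache helper API was still being written at this point - and only
the aops hook and the readahead_control accessors are real kernel API.]

#include <linux/fs.h>
#include <linux/pagemap.h>

/* Hypothetical cache-aware read helper; not a real fscache symbol. */
void myfs_read_helper(struct address_space *mapping, pgoff_t index,
                      unsigned int nr_pages);

/* ->readahead() hands the whole readahead window to the helper, which can
 * then decide what comes from the cache and what needs netfs reads. */
static void myfs_readahead(struct readahead_control *ractl)
{
        myfs_read_helper(ractl->mapping, readahead_index(ractl),
                         readahead_count(ractl));
}

static const struct address_space_operations myfs_aops = {
        .readahead      = myfs_readahead,
        /* ... other ops ... */
};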