[0/3] cifs: cache directory content for shroot

Message ID 20201001205026.8808-1-lsahlber@redhat.com (mailing list archive)

Message

Ronnie Sahlberg Oct. 1, 2020, 8:50 p.m. UTC
Steve, List

See initial implementation of a mechanism to cache the directory entries for
a shared cache handle (shroot).
We cache all the entries during the initial readdir() scan, using the context
from the VFS layer as the key so that we can handle multiple concurrent readdir() scans
of the same directory.
Then, if/when we have successfully cached the entire directory, we will serve any
subsequent readdir() out of the cache, avoiding any query directory calls to the server.

As with all of shroot, the cache is kept until the directory lease is broken.
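
A minimal user-space sketch of the scheme described above, not the code from the patches. All names here (struct cached_dirents, cache_add() and so on) are hypothetical, the readdir context is modelled as an opaque pointer, there is no locking, and the sketch assumes every scan starts at the beginning of the directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not taken from the patches. */
struct cached_dirent {
	struct cached_dirent *next;
	char name[];                /* NUL-terminated entry name */
};

struct cached_dirents {
	bool is_valid;              /* whole directory captured, safe to serve */
	bool is_failed;             /* population aborted (e.g. out of memory) */
	const void *owner;          /* readdir context that populates the cache */
	struct cached_dirent *head, *tail;
	size_t count;
};

/* Record one entry seen by the scan identified by ctx. */
static void cache_add(struct cached_dirents *cde, const void *ctx,
		      const char *name)
{
	size_t len = strlen(name) + 1;
	struct cached_dirent *de;

	if (cde->is_valid || cde->is_failed)
		return;
	if (cde->owner == NULL)     /* the first scan claims the cache */
		cde->owner = ctx;
	if (cde->owner != ctx)      /* a concurrent scan must not mix in */
		return;
	de = malloc(sizeof(*de) + len);
	if (de == NULL) {
		cde->is_failed = true;
		return;
	}
	memcpy(de->name, name, len);
	de->next = NULL;
	if (cde->tail)
		cde->tail->next = de;
	else
		cde->head = de;
	cde->tail = de;
	cde->count++;
}

/* The scan reached end of directory; only the owner can validate. */
static void cache_complete(struct cached_dirents *cde, const void *ctx)
{
	if (cde->owner != NULL && cde->owner == ctx && !cde->is_failed)
		cde->is_valid = true;
}

/* Lease break: drop everything so the next scan goes to the server. */
static void cache_invalidate(struct cached_dirents *cde)
{
	struct cached_dirent *de = cde->head, *next;

	for (; de; de = next) {
		next = de->next;
		free(de);
	}
	*cde = (struct cached_dirents){0};
}

/*
 * Returns the number of entries served from the cache, or -1 if the
 * cache is not valid and the caller has to query the server instead.
 */
static long cache_serve(const struct cached_dirents *cde,
			void (*emit)(const char *name))
{
	const struct cached_dirent *de;

	assert(cde != NULL);
	if (!cde->is_valid)
		return -1;
	for (de = cde->head; de; de = de->next)
		if (emit)
			emit(de->name);
	return (long)cde->count;
}
```

Keying on the context means a second scan that starts while the first one is still running simply bypasses the cache instead of interleaving its entries with the owner's.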


The first two patches are small and just preparation for the third patch. They are separate
patches to make review easier.
The third patch adds the actual meat of the dirent caching.


For now this might not be too exciting because we only cache the root handle.
I hope in the future we will expand the directory caching to handle any/many directories.

Comments

Steve French Oct. 2, 2020, 5:07 a.m. UTC | #1
My initial test of this was to Windows 10, doing ls of the root directory.

During mount as expected I see the initial compounded smb3.1.1
create/query(file_all_info) work and get a directory lease (read
lease) on the root directory

Then doing the ls
I see the expected ls of the root directory query dir return
successfully, but it is a compounded open/query-dir (no lease
requested) not a query dir using the already open root file handle.
We should find a way to allow it to use the root file handle if
possible although perhaps less important when the caching code is
fully working as it will only be requested once.

The next ls though does an open/query but then does stat of all the
files (which is bad, lots of compounded open/query-info/close).  Then
the next ls will do open/query-dir

So the patch set is promising but currently isn't caching the root
directory contents in a way that is recognized by the subsequent ls
commands.  I will try to look at this more this weekend, but let me
know if there is an updated version of the patchset. It will be very, very useful
when we can get this working - very exciting.

ronnie sahlberg Oct. 2, 2020, 8:04 a.m. UTC | #2
On Fri, Oct 2, 2020 at 3:08 PM Steve French <smfrench@gmail.com> wrote:
>
> My initial test of this was to Windows 10, doing ls of the root directory.
>
> During mount as expected I see the initial compounded smb3.1.1
> create/query(file_all_info) work and get a directory lease (read
> lease) on the root directory
>
> Then doing the ls
> I see the expected ls of the root directory query dir return
> successfully, but it is a compounded open/query-dir (no lease
> requested) not a query dir using the already open root file handle.
> We should find a way to allow it to use the root file handle if
> possible although perhaps less important when the caching code is
> fully working as it will only be requested once.

That is something we can improve later as a matter of aesthetics. Yes, we should try to
use already open directory handles if any are available.
Right now this is probably low priority as we only cache the root directory.
When we expand this to other directories as well (maybe cache, with a
lease, the last n directories, and not just ""),
we can do this and will likely see improvements.

>
> The next ls though does an open/query but then does stat of all the
> files (which is bad, lots of compounded open/query-info/close).  Then
> the next ls will do open/query-dir

I don't think we can avoid this. AFAIK the directory lease only
breaks when the directory itself is modified,
i.e. when dirents are added/deleted/renamed, but not if pre-existing
dirents have changes to their inodes.

I.e. does a directory lease break just because an existing file in it
was extended? Does the lease break if an immediate subdirectory has
files added/removed, i.e. st_nlink changes, not for the directory we
have a lease on but for a subdirectory where we do not have a lease?
ronnie sahlberg Oct. 2, 2020, 8:07 a.m. UTC | #3
On Fri, Oct 2, 2020 at 6:04 PM ronnie sahlberg <ronniesahlberg@gmail.com> wrote:
> I don't think we can avoid this. AFAIK the directory lease only
> breaks when the directory itself is modified,
> i.e. when dirents are added/deleted/renamed, but not if pre-existing
> dirents have changes to their inodes.
>
> I.e. does a directory lease break just because an existing file in it
> was extended? Does the lease break if an immediate subdirectory has
> files added/removed, i.e. st_nlink changes, not for the directory we
> have a lease on but for a subdirectory where we do not have a lease?

I mean, even with directory caching, ls -l would still need to stat()
each dirent?

Steve French Oct. 2, 2020, 2:13 p.m. UTC | #4
query dir is doing a FIND_ID_FULL_DIRECTORY_INFO and stat is doing a
FILE_ALL_INFO?  What are we missing from the query dir response that
we would get from FILE_ALL_INFO in the query info one by one?
