[resend,v2,0/2] virtiofs submounts that are still in use forgotten by shrinker

Message ID: cover.1696043833.git.kjlx@templeofstupid.com

Krister Johansen Oct. 2, 2023, 3:24 p.m. UTC
Hi,
I recently ran into a situation where a virtiofs client began
encountering EBADF after the client / guest system had an OOM.  After
reproducing the issue and debugging, I found that the problem is caused
by a virtiofsd submount having the nodeid of its root dentry forgotten.
This occurs because it borrows the reference for this dentry from the
parent that is passed into the function.

In this particular case, the submount had been bind mounted into a
container's mount namespace.  The reference count on the original parent
dentry was 0, making it eligible for eviction.  However, because this
dentry was also the last reference the fuse client knew it had, it sent
a forget message to the server.  This caused all future references to
the FUSE node-id to become invalid from virtiofsd's perspective.
Subsequent attempts to use the node-id for operations against the
submount's root received an EBADF from the server.
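
To illustrate the sequence above, here is a toy userspace model in C.
It is not kernel or virtiofsd code: the toy_* names and the node id 42
are made up for this sketch, and it assumes the server simply keeps one
lookup count per node and answers EBADF once that count reaches zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy model of the server's per-node lookup count (hypothetical). */
struct toy_node {
    uint64_t nodeid;
    uint64_t nlookup;   /* lookups the client has not yet forgotten */
};

/* Each successful LOOKUP reply hands the client one more reference. */
static void toy_lookup(struct toy_node *n)
{
    n->nlookup++;
}

/* FORGET drops references; at zero the server may discard the node. */
static void toy_forget(struct toy_node *n, uint64_t count)
{
    assert(count <= n->nlookup);
    n->nlookup -= count;
}

/* Any later operation on a discarded node id fails with EBADF. */
static int toy_getattr(const struct toy_node *n)
{
    return n->nlookup > 0 ? 0 : -EBADF;
}

/* Replays the failure: the submount root borrows the parent's reference. */
static int toy_replay_borrowed_reference(void)
{
    struct toy_node root = { .nodeid = 42, .nlookup = 0 };

    toy_lookup(&root);          /* parent dentry is looked up once */
    /* The submount is created here without a lookup of its own. */
    toy_forget(&root, 1);       /* shrinker evicts the parent dentry */
    return toy_getattr(&root);  /* the submount root is still in use */
}
```

In this model, toy_replay_borrowed_reference() returns -EBADF even
though the submount root is still mounted and in use.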

The second patch in this pair modifies the virtiofs submount code to
perform a lookup on the nodeid that forms the root of the submount.  The
first patch pulls the revalidate lookup code into a helper function that
can be used both in revalidate and in submount superblock fill.
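
The general shape of that approach can be sketched in userspace C as
follows.  This is only an illustration under simplifying assumptions:
the sketch_* names are hypothetical, the struct stands in for the
server's view of a single node, and the real changes in fs/fuse/dir.c
and fs/fuse/inode.c differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the server's view of one node. */
struct sketch_node {
    uint64_t nlookup;   /* references the client currently holds */
    bool exists;        /* whether the server can still resolve it */
};

/* Shared helper: one lookup round trip, taking a reference on success. */
static int sketch_lookup(struct sketch_node *n)
{
    if (!n->exists)
        return -ENOENT;
    n->nlookup++;
    return 0;
}

/* Caller 1: revalidation reports whether the lookup succeeded. */
static bool sketch_revalidate(struct sketch_node *n)
{
    return sketch_lookup(n) == 0;
}

/* Caller 2: submount fill fails outright if the root cannot be looked up. */
static int sketch_fill_super_submount(struct sketch_node *root)
{
    return sketch_lookup(root);
}

/* With its own reference, the root survives eviction of the parent. */
static int sketch_replay_with_own_reference(void)
{
    /* Start with the single reference held via the parent dentry. */
    struct sketch_node root = { .nlookup = 1, .exists = true };
    int err = sketch_fill_super_submount(&root);

    if (err)
        return err;
    assert(root.nlookup == 2);
    root.nlookup--;     /* shrinker evicts the parent: one forget */
    return root.nlookup > 0 ? 0 : -EBADF;
}
```

Here sketch_replay_with_own_reference() returns 0, because the forget
triggered by evicting the parent no longer drops the last reference.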

Tested via:

- fstests for virtiofs
- fstests for fuse (against passthrough_ll)
- manual testing to watch how refcounts change between client and server
  in response to filesystem access, umount, and eviction by the shrinker.

This resend has been rebased against the latest tip of fuse/for-next and
has had the commit messages in the patches massaged, but contains no
functional modifications since the original v2.

There's also been an issue opened with the project that uses this
functionality.  More details on that can be found at [1].

Changes since v1:

- Cleanups to pacify test robot

Changes since RFC:

- Modified fuse_fill_super_submount to always fail if dentry cannot be
  revalidated.  (Feedback from Bernd Schubert)
- Fixed up an edge case where looked up but subsequently declared
  invalid dentries were not correctly tracking nlookup.  (Error was
  introduced in my RFC).

Thanks,

-K

[1] https://github.com/kata-containers/kata-containers/issues/8040

Krister Johansen (2):
  fuse: revalidate: move lookup into a separate function
  fuse: ensure that submounts lookup their parent

 fs/fuse/dir.c    | 85 +++++++++++++++++++++++++++++++++---------------
 fs/fuse/fuse_i.h |  6 ++++
 fs/fuse/inode.c  | 43 ++++++++++++++++++++----
 3 files changed, 101 insertions(+), 33 deletions(-)

Bernd Schubert Oct. 2, 2023, 10:18 p.m. UTC | #1
On 10/2/23 17:24, Krister Johansen wrote:
> Hi,
> I recently ran into a situation where a virtiofs client began
> encountering EBADF after the client / guest system had an OOM.  After
> reproducing the issue and debugging, the problem is caused by a
> virtiofsd submount having the nodeid of its root dentry forgotten.  This
> occurs because it borrows the reference for this dentry from the parent
> that is passed into the function.


Sorry, I didn't forget you, I just didn't manage to review the 2nd version
yet. I will definitely do so this week.
Please also note that there will be merge conflicts with the atomic open
patches from Dharmendra/me, although they are probably not too difficult
to resolve.


Thanks,
Bernd

Krister Johansen Oct. 3, 2023, 4:48 p.m. UTC | #2
On Tue, Oct 03, 2023 at 12:18:42AM +0200, Bernd Schubert wrote:
> 
> 
> On 10/2/23 17:24, Krister Johansen wrote:
> > Hi,
> > I recently ran into a situation where a virtiofs client began
> > encountering EBADF after the client / guest system had an OOM.  After
> > reproducing the issue and debugging, the problem is caused by a
> > virtiofsd submount having the nodeid of its root dentry forgotten.  This
> > occurs because it borrows the reference for this dentry from the parent
> > that is passed into the function.
> 
> 
> Sorry, I didn't forget you, just didn't manage to review the 2nd version
> yet. Will definitely do this week.

Thanks; I appreciate the feedback you've provided so far.

> Please also note that there will be merge conflicts with atomic open patches
> from Dharmendra/me. Although probably not too difficult to resolve.

Sure. I'm happy to reparent, resolve those conflicts, re-test, and send
another revision when we're ready.  I suspect there are going to be
additional changes requested on the v2.  With that in mind, I'll hold
off for the moment unless it is going to cause headaches for you.

For the atomic-open-revalidate changes: should I be working from what's
on the list?  This is the most recent patchset I see:

https://lore.kernel.org/linux-fsdevel/20230920173445.3943581-1-bschubert@ddn.com/

I found a 6.5 relative tree of yours on GitHub by following the libfuse
pull request, but nothing that seemed in sync with fuse/for-next.

Thanks,

-K

Bernd Schubert Oct. 3, 2023, 10:54 p.m. UTC | #3
On 10/3/23 18:48, Krister Johansen wrote:
> On Tue, Oct 03, 2023 at 12:18:42AM +0200, Bernd Schubert wrote:
>>
>>
>> On 10/2/23 17:24, Krister Johansen wrote:
>>> Hi,
>>> I recently ran into a situation where a virtiofs client began
>>> encountering EBADF after the client / guest system had an OOM.  After
>>> reproducing the issue and debugging, the problem is caused by a
>>> virtiofsd submount having the nodeid of its root dentry forgotten.  This
>>> occurs because it borrows the reference for this dentry from the parent
>>> that is passed into the function.
>>
>>
>> Sorry, I didn't forget you, just didn't manage to review the 2nd version
>> yet. Will definitely do this week.
> 
> Thanks; I appreciate the feedback you've provided so far.
> 
>> Please also note that there will be merge conflicts with atomic open patches
>> from Dharmendra/me. Although probably not too difficult to resolve.
> 
> Sure. I'm happy to reparent, resolve those conflicts, re-test, and send
> another revision when we're ready.  I suspect there are going to be
> additional changes requested on the v2.  With that in mind, I'll hold
> off for the moment unless it is going to cause headaches for you.

I certainly didn't mean that you should check for merge conflicts;
it was more a note that they might come up, depending on the merge
order. Please don't stop making improvements; resolving merge conflicts
shouldn't be difficult.
I'm going to add you to the atomic open patch series to keep you
updated, if you don't mind.


> 
> For the atomic-open-revalidate changes: should I be working from what's
> on the list?  This is the most recent patchset I see:
> 
> https://lore.kernel.org/linux-fsdevel/20230920173445.3943581-1-bschubert@ddn.com/
> 
> I found a 6.5 relative tree of yours on GitHub by following the libfuse
> pull request, but nothing that seemed in sync with fuse/for-next.

I don't think there are conflicts with fuse-next right now, but I can 
check.


Thanks,
Bernd

Krister Johansen Oct. 4, 2023, 1:58 p.m. UTC | #4
On Wed, Oct 04, 2023 at 12:54:49AM +0200, Bernd Schubert wrote:
> 
> 
> On 10/3/23 18:48, Krister Johansen wrote:
> > On Tue, Oct 03, 2023 at 12:18:42AM +0200, Bernd Schubert wrote:
> > > 
> > > 
> > > On 10/2/23 17:24, Krister Johansen wrote:
> > > > Hi,
> > > > I recently ran into a situation where a virtiofs client began
> > > > encountering EBADF after the client / guest system had an OOM.  After
> > > > reproducing the issue and debugging, the problem is caused by a
> > > > virtiofsd submount having the nodeid of its root dentry forgotten.  This
> > > > occurs because it borrows the reference for this dentry from the parent
> > > > that is passed into the function.
> > > 
> > > Please also note that there will be merge conflicts with atomic open patches
> > > from Dharmendra/me. Although probably not too difficult to resolve.
> > 
> > Sure. I'm happy to reparent, resolve those conflicts, re-test, and send
> > another revision when we're ready.  I suspect there are going to be
> > additional changes requested on the v2.  With that in mind, I'll hold
> > off for the moment unless it is going to cause headaches for you.
> 
> I certainly also didn't mean that you should check for merge conflicts, it
> was more an annotation that it might come up - depending on the merge order.
> Please don't stop to do improvements, resolving merge conflicts shouldn't be
> difficult.
> I'm going to add you to the atomic open patch series to keep you updated, if
> you don't mind.

Thanks, no objections from me.  I'm willing to help with any conflict
resolution or retesting tasks, if anything turns out to be non-trivial.
My goal is to get these patches to the state where they're acceptable.
I'm happy to make additional changes, or work against a different
branch.


-K