Message ID | 20200807195526.426056-19-vgoyal@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | virtiofs: Add DAX support | expand |
On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > fuse_file_put(sync) can be called with sync=true/false. If sync=true, > it waits for release request response and then calls iput() in the > caller's context. If sync=false, it does not wait for release request > response, frees the fuse_file struct immediately and req->end function > does the iput(). > > iput() can be a problem with DAX if called in req->end context. If this > is last reference to inode (VFS has let go its reference already), then > iput() will clean DAX mappings as well and send REMOVEMAPPING requests > and wait for completion. (All the the worker thread context which is > processing fuse replies from daemon on the host). > > That means it blocks worker thread and it stops processing further > replies and system deadlocks. Is this reasoning specific to DAX? Seems to me this is a virtio-fs specific issue. Thanks, Miklos
On Mon, Aug 10, 2020 at 10:29:13AM +0200, Miklos Szeredi wrote: > On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > fuse_file_put(sync) can be called with sync=true/false. If sync=true, > > it waits for release request response and then calls iput() in the > > caller's context. If sync=false, it does not wait for release request > > response, frees the fuse_file struct immediately and req->end function > > does the iput(). > > > > iput() can be a problem with DAX if called in req->end context. If this > > is last reference to inode (VFS has let go its reference already), then > > iput() will clean DAX mappings as well and send REMOVEMAPPING requests > > and wait for completion. (All the the worker thread context which is > > processing fuse replies from daemon on the host). > > > > That means it blocks worker thread and it stops processing further > > replies and system deadlocks. > > Is this reasoning specific to DAX? Seems to me this is a virtio-fs > specific issue. I would think it is virtio-fs + DAX issues. virtio-fs without DAX does not have this problem. If making this conditional on DAX an issue, I am wondering, can we now set args->may_block instead. Now virtiofs will do completion in a separate worker thread if ->may_block is set. That means, worker will block till REMOVEMAPPING completes and then be woken up. Do let me know if you like setting args->may_block approach better. I can give that a try. Vivek
On Mon, Aug 10, 2020 at 10:29:13AM +0200, Miklos Szeredi wrote: > On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > fuse_file_put(sync) can be called with sync=true/false. If sync=true, > > it waits for release request response and then calls iput() in the > > caller's context. If sync=false, it does not wait for release request > > response, frees the fuse_file struct immediately and req->end function > > does the iput(). > > > > iput() can be a problem with DAX if called in req->end context. If this > > is last reference to inode (VFS has let go its reference already), then > > iput() will clean DAX mappings as well and send REMOVEMAPPING requests > > and wait for completion. (All the the worker thread context which is > > processing fuse replies from daemon on the host). > > > > That means it blocks worker thread and it stops processing further > > replies and system deadlocks. > > Is this reasoning specific to DAX? Seems to me this is a virtio-fs > specific issue. Hi Miklos, I am looking at this patch closely and maybe we don't need it, given current code. Reason being that looks like for virtiofs now we set ctx->destroy=true and that sets fc->destroy. And if that's set, we don't schedule async req in fuse_release_common(); fuse_release_common() { fuse_file_put(ff, ff->fc->destroy, isdir); } And if we don't schedule async file put, we don't run the risk of blocking worker thread in evict_inode() with DAX enabled. I ran blogbench multiple times with this patch reverted and I can't reproduce the deadlock. So we can drop this patch for now. If need be I will fix it as post merge patch. Thanks Vivek
On Mon, Aug 10, 2020 at 9:37 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > On Mon, Aug 10, 2020 at 10:29:13AM +0200, Miklos Szeredi wrote: > > On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote: > > > > > > fuse_file_put(sync) can be called with sync=true/false. If sync=true, > > > it waits for release request response and then calls iput() in the > > > caller's context. If sync=false, it does not wait for release request > > > response, frees the fuse_file struct immediately and req->end function > > > does the iput(). > > > > > > iput() can be a problem with DAX if called in req->end context. If this > > > is last reference to inode (VFS has let go its reference already), then > > > iput() will clean DAX mappings as well and send REMOVEMAPPING requests > > > and wait for completion. (All the the worker thread context which is > > > processing fuse replies from daemon on the host). > > > > > > That means it blocks worker thread and it stops processing further > > > replies and system deadlocks. > > > > Is this reasoning specific to DAX? Seems to me this is a virtio-fs > > specific issue. > > Hi Miklos, > > I am looking at this patch closely and maybe we don't need it, given > current code. > > Reason being that looks like for virtiofs now we set ctx->destroy=true > and that sets fc->destroy. And if that's set, we don't schedule async req > in fuse_release_common(); > > fuse_release_common() { > fuse_file_put(ff, ff->fc->destroy, isdir); > } > > And if we don't schedule async file put, we don't run the risk of > blocking worker thread in evict_inode() with DAX enabled. > > I ran blogbench multiple times with this patch reverted and I can't > reproduce the deadlock. > > So we can drop this patch for now. If need be I will fix it as > post merge patch. Okay. Yeah, I was thinking along the lines of no-async-release-ever for virtiofs, but apparently that's already done. Thanks, Miklos
diff --git a/fs/fuse/file.c b/fs/fuse/file.c index ecd2a42f6e30..605976a586c2 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -527,6 +527,7 @@ void fuse_release_common(struct file *file, bool isdir) struct fuse_file *ff = file->private_data; struct fuse_release_args *ra = ff->release_args; int opcode = isdir ? FUSE_RELEASEDIR : FUSE_RELEASE; + bool sync = false; fuse_prepare_release(fi, ff, file->f_flags, opcode); @@ -546,8 +547,19 @@ void fuse_release_common(struct file *file, bool isdir) * Make the release synchronous if this is a fuseblk mount, * synchronous RELEASE is allowed (and desirable) in this case * because the server can be trusted not to screw up. + * + * For DAX, fuse server is trusted. So it should be fine to + * do a sync file put. Doing async file put is creating + * problems right now because when request finish, iput() + * can lead to freeing of inode. That means it tears down + * mappings backing DAX memory and sends REMOVEMAPPING message + * to server and blocks for completion. Currently, waiting + * in req->end context deadlocks the system as same worker thread + * can't process REMOVEMAPPING reply it is waiting for. */ - fuse_file_put(ff, ff->fc->destroy, isdir); + if (IS_DAX(file_inode(file)) || ff->fc->destroy) + sync = true; + fuse_file_put(ff, sync, isdir); } static int fuse_open(struct inode *inode, struct file *file)
fuse_file_put(sync) can be called with sync=true/false. If sync=true, it waits for release request response and then calls iput() in the caller's context. If sync=false, it does not wait for release request response, frees the fuse_file struct immediately and req->end function does the iput(). iput() can be a problem with DAX if called in req->end context. If this is last reference to inode (VFS has let go its reference already), then iput() will clean DAX mappings as well and send REMOVEMAPPING requests and wait for completion. (All the the worker thread context which is processing fuse replies from daemon on the host). That means it blocks worker thread and it stops processing further replies and system deadlocks. So for now, force sync release of file in case of DAX inodes. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> --- fs/fuse/file.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)