[RFC,0/3] xfs: nodataio mount option to skip data I/O

Message ID 20240410140956.1186563-1-bfoster@redhat.com (mailing list archive)

Message

Brian Foster April 10, 2024, 2:09 p.m. UTC
Hi all,

bcachefs has a nodataio mount option that is used for isolated metadata
performance testing purposes. When enabled, it performs all metadata I/O
as normal and shortcuts data I/O by directly invoking bio completion.
Kent had asked for something similar for fs comparison purposes some
time ago and I put together a quick hack based around an iomap flag and
mount option for XFS.
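
As a rough illustration of the shortcut described above (a userspace toy model, not the code from this series), a submission helper can either hand the bio to the device or call its completion handler directly. Every name here (toy_bio, toy_dev, toy_submit, the nosubmit parameter) is a hypothetical stand-in and not a kernel or iomap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these names come from the series. */
struct toy_bio {
	const char *data;                     /* payload to write */
	size_t len;                           /* payload length in bytes */
	size_t dev_off;                       /* byte offset on the toy device */
	void (*end_io)(struct toy_bio *bio);  /* completion callback */
	bool completed;
};

struct toy_dev {
	char blocks[64];
	unsigned long writes;                 /* writes that actually reached the device */
};

/* Completion handler: just records that the I/O "finished". */
static void toy_end_io(struct toy_bio *bio)
{
	bio->completed = true;
}

/*
 * Submission helper. With nosubmit set, the device is skipped entirely and
 * the completion handler is invoked directly, so the caller observes a
 * normal completion even though no data moved.
 */
static void toy_submit(struct toy_dev *dev, struct toy_bio *bio, bool nosubmit)
{
	if (!nosubmit) {
		assert(bio->dev_off + bio->len <= sizeof(dev->blocks));
		memcpy(dev->blocks + bio->dev_off, bio->data, bio->len);
		dev->writes++;
	}
	bio->end_io(bio);
}
```

In this model the caller cannot tell the two paths apart from the completion alone, which is the property that makes the mode useful for isolating metadata cost.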

I don't recall if I ever posted the initial version and Kent recently
asked about whether we'd want to consider merging something like this. I
think there are at least a couple things that probably need addressing
before that is a viable option.

One is that the mount option is kind of hacky in and of itself. Beyond
that, this mechanism provides a means for stale data exposure because
writes with nodataio mode enabled will operate as if writes were
completed normally (including unwritten extent conversion). Therefore, a
remount to !nodataio mode means we read off whatever was last written to
storage.
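
The exposure can be modelled in a few lines of userspace C. This is only a sketch under simplifying assumptions: one block per file, an unwritten extent reads back as zeroes, and the write completion always converts the extent. The names (toy_file, toy_write, toy_read) are hypothetical and not XFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BLKSZ 8

enum toy_state { TOY_UNWRITTEN, TOY_WRITTEN };

/* Hypothetical one-block "file"; disk may still hold a previous owner's data. */
struct toy_file {
	char disk[TOY_BLKSZ];
	enum toy_state state;
};

/* Write path: data I/O is skipped under nodataio, completion is not. */
static void toy_write(struct toy_file *f, const char *buf, bool nodataio)
{
	if (!nodataio)
		memcpy(f->disk, buf, TOY_BLKSZ);
	/* Completion runs either way, so the extent is converted to written. */
	f->state = TOY_WRITTEN;
}

/* Read path after a remount without nodataio: written extents hit the disk. */
static void toy_read(const struct toy_file *f, char *out)
{
	if (f->state == TOY_UNWRITTEN)
		memset(out, 0, TOY_BLKSZ);        /* unwritten extents read as zeroes */
	else
		memcpy(out, f->disk, TOY_BLKSZ);  /* whatever was last written to storage */
}
```

Before the skipped write the block reads as zeroes; afterwards it reads as the old on-disk contents, because the extent state says "written" while the device was never updated.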

Kent mentioned that Eric (or somebody?) had floated the idea of a
mkfs-time feature flag or some such to control nodataio mode. That would
avoid mount API changes in general and also disallow use of such
filesystems in a non-nodataio mode, so to me it seems like the direction
bcachefs should go with its variant of this regardless.
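
One way to picture the feature-flag approach is a toy model where the mode is derived from the superblock at mount time instead of from a mount option. The assumptions here are mine: the bit is treated as an incompat feature, and all names (TOY_FEAT_INCOMPAT_NODATAIO, toy_sb, toy_do_mount) are hypothetical and not an on-disk format of XFS or bcachefs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical incompat feature bit, set once at mkfs time. */
#define TOY_FEAT_INCOMPAT_NODATAIO (1u << 0)

struct toy_sb    { uint32_t incompat; };  /* toy on-disk superblock */
struct toy_mount { bool nodataio; };      /* toy in-memory mount state */

/* "mkfs": the only place where the mode is chosen. */
static struct toy_sb toy_mkfs(bool nodataio)
{
	struct toy_sb sb = { .incompat = nodataio ? TOY_FEAT_INCOMPAT_NODATAIO : 0 };
	return sb;
}

/*
 * "mount": returns 0 on success, -1 if the superblock carries incompat bits
 * this implementation does not support. The mode comes only from the
 * superblock, so a flagged filesystem can never run with data I/O enabled.
 */
static int toy_do_mount(const struct toy_sb *sb, uint32_t supported,
			struct toy_mount *mp)
{
	if (sb->incompat & ~supported)
		return -1;
	mp->nodataio = (sb->incompat & TOY_FEAT_INCOMPAT_NODATAIO) != 0;
	return 0;
}
```

With no mount-time switch there is no way to flip a flagged filesystem back to normal data I/O, and an implementation that does not know the bit refuses the mount outright.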

Personally, I don't have much of an opinion on whether something like
this lands upstream or just remains as a local test hack for isolated
performance testing. The code is simple enough as it is and not really
worth the additional polishing for the latter, but I offered to at least
rebase and post for discussion. Thoughts, reviews, flames appreciated.

Brian

Brian Foster (3):
  iomap: factor out a bio submission helper
  iomap: add nosubmit flag to skip data I/O on iomap mapping
  xfs: add nodataio mount option to skip all data I/O

 fs/iomap/buffered-io.c | 37 ++++++++++++++++++++++++++++---------
 fs/xfs/xfs_iomap.c     |  3 +++
 fs/xfs/xfs_mount.h     |  2 ++
 fs/xfs/xfs_super.c     |  6 +++++-
 include/linux/iomap.h  |  1 +
 5 files changed, 39 insertions(+), 10 deletions(-)

Comments

Kent Overstreet April 10, 2024, 4:17 p.m. UTC | #1
I'm contemplating adding the superblock option to bcachefs as well; that
would fit well with using this for working with metadata dumps too
("Yes, we know all data checksums are wrong, it's fine").

Another thing that makes this exceedingly useful - SSDs these days are
_garbage_ in terms of getting consistent results. Without this, run to
run variance is ridiculous unless you do a bunch of prep between each
test (which takes longer than the test itself).