mbox series

[00/15] btrfs: read repair/direct I/O improvements

Message ID cover.1583789410.git.osandov@fb.com (mailing list archive)
Headers show
Series btrfs: read repair/direct I/O improvements | expand

Message

Omar Sandoval March 9, 2020, 9:32 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

Hi,

This series includes several fixes, cleanups, and improvements to direct
I/O and read repair. It's preparation for adding read repair to my
RWF_ENCODED series [1], but it can go in independently.

Patches 1 and 2 are direct I/O error handling fixes. Patch 3 is a
buffered read repair fix. Patch 4 is a buffered read repair improvement.
Patches 5-9 are trivial cleanups. Patch 10 converts direct I/O to use
refcount_t, which would've helped catch the bug fixed by patch 1.
Patches 11-14 drastically simplify the direct I/O code. Patch 15 gets
unifies buffered and direct I/O read repair, which also makes direct I/O
repair actually do validation for large failed reads instead of
rewriting the whole thing.

Overall, this is net about -400 lines of code and actually makes direct
I/O more functional.

Note that this series causes btrfs/142 to fail. This is a bug in the
test, as it assumes that direct I/O doesn't do read validation. I'm
working on a fix for the test.

Christoph is cc'd for patch 3. The fix looks at the bio internals in a
way that I wasn't sure was recommended, although there is precedent in
the bcache code. I'd appreciate if Christoph acked that patch or
suggested a better approach.

This series is based on misc-next.

Thanks!

1: https://lore.kernel.org/linux-fsdevel/cover.1582930832.git.osandov@fb.com/

Omar Sandoval (15):
  btrfs: fix error handling when submitting direct I/O bio
  btrfs: fix double __endio_write_update_ordered in direct I/O
  btrfs: look at full bi_io_vec for repair decision
  btrfs: don't do repair validation for checksum errors
  btrfs: clarify btrfs_lookup_bio_sums documentation
  btrfs: rename __readpage_endio_check to check_data_csum
  btrfs: make btrfs_check_repairable() static
  btrfs: move btrfs_dio_private to inode.c
  btrfs: kill btrfs_dio_private->private
  btrfs: convert btrfs_dio_private->pending_bios to refcount_t
  btrfs: put direct I/O checksums in btrfs_dio_private instead of bio
  btrfs: get rid of one layer of bios in direct I/O
  btrfs: simplify direct I/O read repair
  btrfs: get rid of endio_repair_workers
  btrfs: unify buffered and direct I/O read repair

 fs/btrfs/btrfs_inode.h |  30 --
 fs/btrfs/ctree.h       |   1 -
 fs/btrfs/disk-io.c     |   8 +-
 fs/btrfs/disk-io.h     |   1 -
 fs/btrfs/extent_io.c   | 141 +++++----
 fs/btrfs/extent_io.h   |  19 +-
 fs/btrfs/file-item.c   |  11 +-
 fs/btrfs/inode.c       | 694 ++++++++++-------------------------------
 8 files changed, 260 insertions(+), 645 deletions(-)

Comments

Christoph Hellwig March 10, 2020, 4:39 p.m. UTC | #1
Adding Goldwyn,

as he has been looking into converting btrfs to the iomap direct
I/O code.  This doesn't look like a major conflict, but he should
be knowledgeable about the dio code by now after a few iterations
of that.

On Mon, Mar 09, 2020 at 02:32:26PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Hi,
> 
> This series includes several fixes, cleanups, and improvements to direct
> I/O and read repair. It's preparation for adding read repair to my
> RWF_ENCODED series [1], but it can go in independently.
> 
> Patches 1 and 2 are direct I/O error handling fixes. Patch 3 is a
> buffered read repair fix. Patch 4 is a buffered read repair improvement.
> Patches 5-9 are trivial cleanups. Patch 10 converts direct I/O to use
> refcount_t, which would've helped catch the bug fixed by patch 1.
> Patches 11-14 drastically simplify the direct I/O code. Patch 15 gets
> unifies buffered and direct I/O read repair, which also makes direct I/O
> repair actually do validation for large failed reads instead of
> rewriting the whole thing.
> 
> Overall, this is net about -400 lines of code and actually makes direct
> I/O more functional.
> 
> Note that this series causes btrfs/142 to fail. This is a bug in the
> test, as it assumes that direct I/O doesn't do read validation. I'm
> working on a fix for the test.
> 
> Christoph is cc'd for patch 3. The fix looks at the bio internals in a
> way that I wasn't sure was recommended, although there is precedent in
> the bcache code. I'd appreciate if Christoph acked that patch or
> suggested a better approach.
> 
> This series is based on misc-next.
> 
> Thanks!
> 
> 1: https://lore.kernel.org/linux-fsdevel/cover.1582930832.git.osandov@fb.com/
> 
> Omar Sandoval (15):
>   btrfs: fix error handling when submitting direct I/O bio
>   btrfs: fix double __endio_write_update_ordered in direct I/O
>   btrfs: look at full bi_io_vec for repair decision
>   btrfs: don't do repair validation for checksum errors
>   btrfs: clarify btrfs_lookup_bio_sums documentation
>   btrfs: rename __readpage_endio_check to check_data_csum
>   btrfs: make btrfs_check_repairable() static
>   btrfs: move btrfs_dio_private to inode.c
>   btrfs: kill btrfs_dio_private->private
>   btrfs: convert btrfs_dio_private->pending_bios to refcount_t
>   btrfs: put direct I/O checksums in btrfs_dio_private instead of bio
>   btrfs: get rid of one layer of bios in direct I/O
>   btrfs: simplify direct I/O read repair
>   btrfs: get rid of endio_repair_workers
>   btrfs: unify buffered and direct I/O read repair
> 
>  fs/btrfs/btrfs_inode.h |  30 --
>  fs/btrfs/ctree.h       |   1 -
>  fs/btrfs/disk-io.c     |   8 +-
>  fs/btrfs/disk-io.h     |   1 -
>  fs/btrfs/extent_io.c   | 141 +++++----
>  fs/btrfs/extent_io.h   |  19 +-
>  fs/btrfs/file-item.c   |  11 +-
>  fs/btrfs/inode.c       | 694 ++++++++++-------------------------------
>  8 files changed, 260 insertions(+), 645 deletions(-)
> 
> -- 
> 2.25.1
---end quoted text---
Omar Sandoval March 11, 2020, 9:22 a.m. UTC | #2
On Tue, Mar 10, 2020 at 05:39:40PM +0100, Christoph Hellwig wrote:
> Adding Goldwyn,
> 
> as he has been looking into converting btrfs to the iomap direct
> I/O code.  This doesn't look like a major conflict, but he should
> be knowledgeable about the dio code by now after a few iterations
> of that.

Thanks, I did try to avoid conflicting with the iomap rework, and I'm
looking forward to the further improvement.

Btw, Christoph, I'd love to get your opinion on the RWF_ENCODED series
("[PATCH v4 0/9] fs: interface for directly reading/writing compressed
data").
Goldwyn Rodrigues March 18, 2020, 4:33 p.m. UTC | #3
On  2:22 11/03, Omar Sandoval wrote:
> On Tue, Mar 10, 2020 at 05:39:40PM +0100, Christoph Hellwig wrote:
> > Adding Goldwyn,
> > 
> > as he has been looking into converting btrfs to the iomap direct
> > I/O code.  This doesn't look like a major conflict, but he should
> > be knowledgeable about the dio code by now after a few iterations
> > of that.
> 
> Thanks, I did try to avoid conflicting with the iomap rework, and I'm
> looking forward to the further improvement.

Sorry for the late response, I have been on vacation last week and
recovering from backlog. Most of the code does not seem to conflict with
the direct I/O iomap work.

I am ready with my patches in my git [1]. However, I would like to base
my patches on yours. Do you have them in a public git tree anywhere?
preferably with the acks/review comments.

[1] https://github.com/goldwynr/linux/tree/btrfs-iomap-dio
David Sterba March 18, 2020, 10:07 p.m. UTC | #4
On Mon, Mar 09, 2020 at 02:32:26PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> This series includes several fixes, cleanups, and improvements to direct
> I/O and read repair. It's preparation for adding read repair to my
> RWF_ENCODED series [1], but it can go in independently.

Overall it looks good, but also scary. There are some comments to
address, I haven't reviewed it thoroughly yes so please don't resend
unless there's something that would harm testing. I'll put the branch to
for-next for some coverage.
David Sterba March 19, 2020, 2:08 p.m. UTC | #5
On Wed, Mar 18, 2020 at 11:33:18AM -0500, Goldwyn Rodrigues wrote:
> On  2:22 11/03, Omar Sandoval wrote:

> I am ready with my patches in my git [1]. However, I would like to base
> my patches on yours. Do you have them in a public git tree anywhere?
> preferably with the acks/review comments.

I have the branch for testing in my github repository,
https://github.com/kdave/btrfs-devel/tree/ext/omar/dio-fixes .

Regarding the two patchsest, it makes things easier to test at least if
the code does not conflict but I'm a bit worried about adding 2
different reworks of the DIO in one release. OTOH, we' have focus on the
same subsystem, so at this point I'm preferring the option to merge both
unless something serious pops up. The merge target would be 5.8, plenty
of time to revise the desision if needed.
Christoph Hellwig March 20, 2020, 2:10 p.m. UTC | #6
Btw, while you touch this code:

It seems like btrfs_dio_private.subio_endio is rather pointless,
as it is always set to one function for reads and otherwise never
set.  De-virtualizing this call could help making the code a little
faster and easier to understand.
Christoph Hellwig March 20, 2020, 2:11 p.m. UTC | #7
On Fri, Mar 20, 2020 at 03:10:41PM +0100, Christoph Hellwig wrote:
> Btw, while you touch this code:
> 
> It seems like btrfs_dio_private.subio_endio is rather pointless,
> as it is always set to one function for reads and otherwise never
> set.  De-virtualizing this call could help making the code a little
> faster and easier to understand.

Doh, one of the later patches actually kills it.  Time for more coffee..
Omar Sandoval March 20, 2020, 9:29 p.m. UTC | #8
On Wed, Mar 18, 2020 at 11:07:33PM +0100, David Sterba wrote:
> On Mon, Mar 09, 2020 at 02:32:26PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > This series includes several fixes, cleanups, and improvements to direct
> > I/O and read repair. It's preparation for adding read repair to my
> > RWF_ENCODED series [1], but it can go in independently.
> 
> Overall it looks good, but also scary. There are some comments to
> address, I haven't reviewed it thoroughly yes so please don't resend
> unless there's something that would harm testing. I'll put the branch to
> for-next for some coverage.

Thanks! The only thing so far that might affect testing is the
uninitialized geom.len ASSERT that Nikolay pointed out.