Message ID | 20180627201231.15641-1-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Jun 27, 2018 at 01:12:31PM -0700, Bart Van Assche wrote: > Although __bio_clone_fast() copies bi_io_vec, it does not copy bi_vcnt, > the number of elements in bi_io_vec[] that contains data. Copy bi_vcnt > such that code that needs this member behaves identically for original > and for cloned requests. > > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Mike Snitzer <snitzer@redhat.com> > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Hannes Reinecke <hare@suse.com> > Cc: Johannes Thumshirn <jthumshirn@suse.de> > --- > block/bio.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/block/bio.c b/block/bio.c > index f7e3d88bd0b6..55f8e0dedd69 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -605,6 +605,7 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) > bio->bi_opf = bio_src->bi_opf; > bio->bi_write_hint = bio_src->bi_write_hint; > bio->bi_iter = bio_src->bi_iter; > + bio->bi_vcnt = bio_src->bi_vcnt; > bio->bi_io_vec = bio_src->bi_io_vec; No, don't do that. For a cloned bio, both .bi_vcnt and .bi_io_vec can't be used directly, and we have done huge such cleanup for supporting multipage bvec, that is the great idea of immutable bvecs invented by Kent. Thanks, Ming
On 06/27/18 16:50, Ming Lei wrote: > On Wed, Jun 27, 2018 at 01:12:31PM -0700, Bart Van Assche wrote: >> Although __bio_clone_fast() copies bi_io_vec, it does not copy bi_vcnt, >> the number of elements in bi_io_vec[] that contains data. Copy bi_vcnt >> such that code that needs this member behaves identically for original >> and for cloned requests. >> >> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> >> Cc: Christoph Hellwig <hch@lst.de> >> Cc: Mike Snitzer <snitzer@redhat.com> >> Cc: Ming Lei <ming.lei@redhat.com> >> Cc: Hannes Reinecke <hare@suse.com> >> Cc: Johannes Thumshirn <jthumshirn@suse.de> >> --- >> block/bio.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/block/bio.c b/block/bio.c >> index f7e3d88bd0b6..55f8e0dedd69 100644 >> --- a/block/bio.c >> +++ b/block/bio.c >> @@ -605,6 +605,7 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) >> bio->bi_opf = bio_src->bi_opf; >> bio->bi_write_hint = bio_src->bi_write_hint; >> bio->bi_iter = bio_src->bi_iter; >> + bio->bi_vcnt = bio_src->bi_vcnt; >> bio->bi_io_vec = bio_src->bi_io_vec; > > No, don't do that. Why not? I think it's a huge booby trap that cloned bio's have a bi_io_vec but zero bi_vcnt. > For a cloned bio, both .bi_vcnt and .bi_io_vec can't be used directly, > and we have done huge such cleanup for supporting multipage bvec, that is > the great idea of immutable bvecs invented by Kent. My understanding is that the above change doesn't conflict at all with the immutable biovec work -- to the contrary. Bart.
On Wed, Jun 27, 2018 at 04:59:41PM -0700, Bart Van Assche wrote: > On 06/27/18 16:50, Ming Lei wrote: > > On Wed, Jun 27, 2018 at 01:12:31PM -0700, Bart Van Assche wrote: > > > Although __bio_clone_fast() copies bi_io_vec, it does not copy bi_vcnt, > > > the number of elements in bi_io_vec[] that contains data. Copy bi_vcnt > > > such that code that needs this member behaves identically for original > > > and for cloned requests. > > > > > > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > > > Cc: Christoph Hellwig <hch@lst.de> > > > Cc: Mike Snitzer <snitzer@redhat.com> > > > Cc: Ming Lei <ming.lei@redhat.com> > > > Cc: Hannes Reinecke <hare@suse.com> > > > Cc: Johannes Thumshirn <jthumshirn@suse.de> > > > --- > > > block/bio.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/block/bio.c b/block/bio.c > > > index f7e3d88bd0b6..55f8e0dedd69 100644 > > > --- a/block/bio.c > > > +++ b/block/bio.c > > > @@ -605,6 +605,7 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) > > > bio->bi_opf = bio_src->bi_opf; > > > bio->bi_write_hint = bio_src->bi_write_hint; > > > bio->bi_iter = bio_src->bi_iter; > > > + bio->bi_vcnt = bio_src->bi_vcnt; > > > bio->bi_io_vec = bio_src->bi_io_vec; > > > > No, don't do that. > > Why not? I think it's a huge booby trap that cloned bio's have a bi_io_vec > but zero bi_vcnt. One core idea of immutable bvec is to use bio->bi_iter and the original bvec table to iterate over anywhere in the bio. That is why .bi_io_vec needs to copy, but not see any reason why .bi_vcnt needs to do. Do you have use cases on .bi_vcnt for cloned bio? Thanks, Ming
On 06/27/18 17:30, Ming Lei wrote: > One core idea of immutable bvec is to use bio->bi_iter and the original > bvec table to iterate over anywhere in the bio. That is why .bi_io_vec > needs to copy, but not see any reason why .bi_vcnt needs to do. > > Do you have use cases on .bi_vcnt for cloned bio? So far this is only a theoretical concern. There are many functions in the block layer that use .bi_vcnt, and it is a lot of work to figure out all the callers of all these functions. Bart.
On Thu, Jun 28 2018 at 11:21am -0400, Bart Van Assche <bart.vanassche@wdc.com> wrote: > On 06/27/18 17:30, Ming Lei wrote: > >One core idea of immutable bvec is to use bio->bi_iter and the original > >bvec table to iterate over anywhere in the bio. That is why .bi_io_vec > >needs to copy, but not see any reason why .bi_vcnt needs to do. > > > >Do you have use cases on .bi_vcnt for cloned bio? > > So far this is only a theoretical concern. There are many functions > in the block layer that use .bi_vcnt, and it is a lot of work to > figure out all the callers of all these functions. No point wasting time with that. I don't understand why Ming cares. Your change is obviously correct. The state should get transfered over to reflect reality. This patch doesn't harm anything, it just prevents some clone-specific failure in the future. Acked-by: Mike Snitzer <snitzer@redhat.com>
On 6/27/18 2:12 PM, Bart Van Assche wrote: > Although __bio_clone_fast() copies bi_io_vec, it does not copy bi_vcnt, > the number of elements in bi_io_vec[] that contains data. Copy bi_vcnt > such that code that needs this member behaves identically for original > and for cloned requests. Applied - it's correct for the current base.
On Thu, Jun 28, 2018 at 09:53:23AM -0600, Jens Axboe wrote: > On 6/27/18 2:12 PM, Bart Van Assche wrote: > > Although __bio_clone_fast() copies bi_io_vec, it does not copy bi_vcnt, > > the number of elements in bi_io_vec[] that contains data. Copy bi_vcnt > > such that code that needs this member behaves identically for original > > and for cloned requests. > > Applied - it's correct for the current base. > Any users of cloned bio shouldn't use this bio's .bi_vcnt directly, since this counter represents the original bio's actual io vector number, nothing related with the cloned bio. So I don't understand why we need to copy it. Thanks, Ming
On Thu, Jun 28, 2018 at 11:21 PM, Bart Van Assche <bart.vanassche@wdc.com> wrote: > On 06/27/18 17:30, Ming Lei wrote: >> >> One core idea of immutable bvec is to use bio->bi_iter and the original >> bvec table to iterate over anywhere in the bio. That is why .bi_io_vec >> needs to copy, but not see any reason why .bi_vcnt needs to do. >> >> Do you have use cases on .bi_vcnt for cloned bio? > > > So far this is only a theoretical concern. There are many functions in the > block layer that use .bi_vcnt, and it is a lot of work to figure out all the > callers of all these functions. No, any functions using .bi_vcnt on a cloned-bio may be a bug, and we should take a close look. Thanks, Ming Lei
On Fri, Jun 29, 2018 at 07:10:47AM +0800, Ming Lei wrote: > On Thu, Jun 28, 2018 at 11:21 PM, Bart Van Assche > <bart.vanassche@wdc.com> wrote: > > On 06/27/18 17:30, Ming Lei wrote: > >> > >> One core idea of immutable bvec is to use bio->bi_iter and the original > >> bvec table to iterate over anywhere in the bio. That is why .bi_io_vec > >> needs to copy, but not see any reason why .bi_vcnt needs to do. > >> > >> Do you have use cases on .bi_vcnt for cloned bio? > > > > > > So far this is only a theoretical concern. There are many functions in the > > block layer that use .bi_vcnt, and it is a lot of work to figure out all the > > callers of all these functions. Back when I implemented immutable biovecs I thoroughly audited all the bi_vcnt uses and removed all of them that weren't by the code that owns/submits the bio. Grepping around I see one or two suspicious uses.. blk-merge.c in particular > No, any functions using .bi_vcnt on a cloned-bio may be a bug, and we should > take a close look. not just cloned bios, any code using bi_vcnt on a bio it didn't create is wrong. so big nack to this patch (I wasn't ccd on it though and it doesn't seem to have hit lkml, so I can't find the original patch...)
On 06/28/18 16:16, Kent Overstreet wrote: > On Fri, Jun 29, 2018 at 07:10:47AM +0800, Ming Lei wrote: >> On Thu, Jun 28, 2018 at 11:21 PM, Bart Van Assche >> <bart.vanassche@wdc.com> wrote: >>> On 06/27/18 17:30, Ming Lei wrote: >>>> >>>> One core idea of immutable bvec is to use bio->bi_iter and the original >>>> bvec table to iterate over anywhere in the bio. That is why .bi_io_vec >>>> needs to copy, but not see any reason why .bi_vcnt needs to do. >>>> >>>> Do you have use cases on .bi_vcnt for cloned bio? >>> >>> So far this is only a theoretical concern. There are many functions in the >>> block layer that use .bi_vcnt, and it is a lot of work to figure out all the >>> callers of all these functions. > > Back when I implemented immutable biovecs I thoroughly audited all the bi_vcnt > uses and removed all of them that weren't by the code that owns/submits the bio. > > Grepping around I see one or two suspicious uses.. blk-merge.c in particular > >> No, any functions using .bi_vcnt on a cloned-bio may be a bug, and we should >> take a close look. > > not just cloned bios, any code using bi_vcnt on a bio it didn't create is wrong. > > so big nack to this patch (I wasn't ccd on it though and it doesn't seem to have > hit lkml, so I can't find the original patch...) Hello Kent, Thanks for chiming in. The linux-block mailing list is archived by multiple websites. The entire e-mail thread is available on e.g. https://www.mail-archive.com/linux-block@vger.kernel.org/msg23006.html. I have a question for you: at least in kernel v4.17 bio_clone_bioset() copies bi_vcnt from the source to the destination bio. However, __bio_clone_fast() doesn't copy bi_vcnt. Isn't that an inconsistency? Thanks, Bart.
On Thu, Jun 28, 2018 at 04:54:44PM -0700, Bart Van Assche wrote: > On 06/28/18 16:16, Kent Overstreet wrote: > > On Fri, Jun 29, 2018 at 07:10:47AM +0800, Ming Lei wrote: > > > On Thu, Jun 28, 2018 at 11:21 PM, Bart Van Assche > > > <bart.vanassche@wdc.com> wrote: > > > > On 06/27/18 17:30, Ming Lei wrote: > > > > > > > > > > One core idea of immutable bvec is to use bio->bi_iter and the original > > > > > bvec table to iterate over anywhere in the bio. That is why .bi_io_vec > > > > > needs to copy, but not see any reason why .bi_vcnt needs to do. > > > > > > > > > > Do you have use cases on .bi_vcnt for cloned bio? > > > > > > > > So far this is only a theoretical concern. There are many functions in the > > > > block layer that use .bi_vcnt, and it is a lot of work to figure out all the > > > > callers of all these functions. > > > > Back when I implemented immutable biovecs I thoroughly audited all the bi_vcnt > > uses and removed all of them that weren't by the code that owns/submits the bio. > > > > Grepping around I see one or two suspicious uses.. blk-merge.c in particular > > > > > No, any functions using .bi_vcnt on a cloned-bio may be a bug, and we should > > > take a close look. > > > > not just cloned bios, any code using bi_vcnt on a bio it didn't create is wrong. > > > > so big nack to this patch (I wasn't ccd on it though and it doesn't seem to have > > hit lkml, so I can't find the original patch...) > > Hello Kent, > > Thanks for chiming in. The linux-block mailing list is archived by multiple > websites. The entire e-mail thread is available on e.g. > https://www.mail-archive.com/linux-block@vger.kernel.org/msg23006.html. > > I have a question for you: at least in kernel v4.17 bio_clone_bioset() > copies bi_vcnt from the source to the destination bio. However, > __bio_clone_fast() doesn't copy bi_vcnt. Isn't that an inconsistency? No - when you use bio_clone_bioset() you get a bio that you own and can do whatever you want with, so it does make sense for it to initialize bi_vcnt. e.g. you could use bio_clone_bioset() when you're going to be bouncing a bio, iterating over each bvec and allocating a new page and copying data from the old page to the new page.
On 6/28/18 5:16 PM, Kent Overstreet wrote: > On Fri, Jun 29, 2018 at 07:10:47AM +0800, Ming Lei wrote: >> On Thu, Jun 28, 2018 at 11:21 PM, Bart Van Assche >> <bart.vanassche@wdc.com> wrote: >>> On 06/27/18 17:30, Ming Lei wrote: >>>> >>>> One core idea of immutable bvec is to use bio->bi_iter and the original >>>> bvec table to iterate over anywhere in the bio. That is why .bi_io_vec >>>> needs to copy, but not see any reason why .bi_vcnt needs to do. >>>> >>>> Do you have use cases on .bi_vcnt for cloned bio? >>> >>> >>> So far this is only a theoretical concern. There are many functions in the >>> block layer that use .bi_vcnt, and it is a lot of work to figure out all the >>> callers of all these functions. > > Back when I implemented immutable biovecs I thoroughly audited all the bi_vcnt > uses and removed all of them that weren't by the code that owns/submits the bio. > > Grepping around I see one or two suspicious uses.. blk-merge.c in particular > >> No, any functions using .bi_vcnt on a cloned-bio may be a bug, and we should >> take a close look. > > not just cloned bios, any code using bi_vcnt on a bio it didn't create is wrong. > > so big nack to this patch (I wasn't ccd on it though and it doesn't seem to have > hit lkml, so I can't find the original patch...) Yeah, you are both right, I was smoking crack.
On 06/28/18 17:04, Kent Overstreet wrote: > On Thu, Jun 28, 2018 at 04:54:44PM -0700, Bart Van Assche wrote: >> Thanks for chiming in. The linux-block mailing list is archived by multiple >> websites. The entire e-mail thread is available on e.g. >> https://www.mail-archive.com/linux-block@vger.kernel.org/msg23006.html. >> >> I have a question for you: at least in kernel v4.17 bio_clone_bioset() >> copies bi_vcnt from the source to the destination bio. However, >> __bio_clone_fast() doesn't copy bi_vcnt. Isn't that an inconsistency? > > No - when you use bio_clone_bioset() you get a bio that you own and can do > whatever you want with, so it does make sense for it to initialize bi_vcnt. > > e.g. you could use bio_clone_bioset() when you're going to be bouncing a bio, > iterating over each bvec and allocating a new page and copying data from the old > page to the new page. Hello Kent, Have you considered to explain this in a comment above __bio_clone_fast() to avoid that the next person who reads that function gets confused? Bart.
On Fri, Jun 29, 2018 at 01:00:33PM -0700, Bart Van Assche wrote: > On 06/28/18 17:04, Kent Overstreet wrote: > > On Thu, Jun 28, 2018 at 04:54:44PM -0700, Bart Van Assche wrote: > > > Thanks for chiming in. The linux-block mailing list is archived by multiple > > > websites. The entire e-mail thread is available on e.g. > > > https://www.mail-archive.com/linux-block@vger.kernel.org/msg23006.html. > > > > > > I have a question for you: at least in kernel v4.17 bio_clone_bioset() > > > copies bi_vcnt from the source to the destination bio. However, > > > __bio_clone_fast() doesn't copy bi_vcnt. Isn't that an inconsistency? > > > > No - when you use bio_clone_bioset() you get a bio that you own and can do > > whatever you want with, so it does make sense for it to initialize bi_vcnt. > > > > e.g. you could use bio_clone_bioset() when you're going to be bouncing a bio, > > iterating over each bvec and allocating a new page and copying data from the old > > page to the new page. > > Hello Kent, > > Have you considered to explain this in a comment above __bio_clone_fast() to > avoid that the next person who reads that function gets confused? Well, there is a large comment in bio_clone_bioset() that you must have found...
diff --git a/block/bio.c b/block/bio.c index f7e3d88bd0b6..55f8e0dedd69 100644 --- a/block/bio.c +++ b/block/bio.c @@ -605,6 +605,7 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) bio->bi_opf = bio_src->bi_opf; bio->bi_write_hint = bio_src->bi_write_hint; bio->bi_iter = bio_src->bi_iter; + bio->bi_vcnt = bio_src->bi_vcnt; bio->bi_io_vec = bio_src->bi_io_vec; bio_clone_blkcg_association(bio, bio_src);
Although __bio_clone_fast() copies bi_io_vec, it does not copy bi_vcnt, the number of elements in bi_io_vec[] that contains data. Copy bi_vcnt such that code that needs this member behaves identically for original and for cloned requests. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> --- block/bio.c | 1 + 1 file changed, 1 insertion(+)