mbox series

[0/3] block: blk_interposer - Block Layer Interposer

Message ID 1607518911-30692-1-git-send-email-sergei.shtepa@veeam.com (mailing list archive)
Headers show
Series block: blk_interposer - Block Layer Interposer | expand

Message

Sergei Shtepa Dec. 9, 2020, 1:01 p.m. UTC
Hi all.

I try to suggest the Block Layer Interposer (blk_interposer) again.
It`s allows to intercept bio requests, remap bio to another devices
or add new bios.

Initially, blk_interposer was designed to be compatible with
device mapper. Our (my and Hannes) previous attempt to offer
blk_interposer integrated with device mapper did not receive
any comments from the dm-devel team, and without their help
I will not be able to make a full implementation. I hope later
they will have time to support blk_interposer in device mapper.

And my blk-snap module requires an architecture change to
support blk_interposer.

This time I offer it along with a sample.
Despite the fact that blk_interposer is quite simple,
there are a few non-obvious points that I would like to clarify.

However, I suggest the blk_interposer now so that people
could discuss it and use it in their projects as soon as possible.

Sergei Shtepa (3):
  block: blk_interposer - Block Layer Interposer
  block: blk_interposer - sample
  block: blk_interposer - sample config

 block/blk-core.c                        |  32 +++
 include/linux/blk_types.h               |   6 +-
 include/linux/genhd.h                   |  12 +-
 samples/Kconfig                         |   7 +
 samples/Makefile                        |   1 +
 samples/blk_interposer/Makefile         |   2 +
 samples/blk_interposer/blk-interposer.c | 276 ++++++++++++++++++++++++
 7 files changed, 333 insertions(+), 3 deletions(-)
 create mode 100644 samples/blk_interposer/Makefile
 create mode 100644 samples/blk_interposer/blk-interposer.c

--
2.20.1

Comments

Mike Snitzer Dec. 9, 2020, 1:51 p.m. UTC | #1
On Wed, Dec 09 2020 at  8:01am -0500,
Sergei Shtepa <sergei.shtepa@veeam.com> wrote:

> Hi all.
> 
> I try to suggest the Block Layer Interposer (blk_interposer) again.
> It`s allows to intercept bio requests, remap bio to another devices
> or add new bios.
> 
> Initially, blk_interposer was designed to be compatible with
> device mapper. Our (my and Hannes) previous attempt to offer
> blk_interposer integrated with device mapper did not receive
> any comments from the dm-devel team, and without their help
> I will not be able to make a full implementation. I hope later
> they will have time to support blk_interposer in device mapper.

Excuse me?  I gave you quite a bit of early feedback!  I then went on
PTO for ~10 days, when I returned last week I had to deal with fixing
some 5.10 dm/block bio splitting regressions that only got resolved this
past Saturday in time for 5.10-rc7.

blk_interposer was/is on my short list to review closer (Hannes' version
that refined yours a bit more).. primarily to see if I could avoid the
extra clone and endio hooking.

The development window for 5.11 is past, so you pushing this without
using the approach discussed (in terms of DM) doesn't help your cause.

> And my blk-snap module requires an architecture change to
> support blk_interposer.
> 
> This time I offer it along with a sample.
> Despite the fact that blk_interposer is quite simple,
> there are a few non-obvious points that I would like to clarify.
> 
> However, I suggest the blk_interposer now so that people
> could discuss it and use it in their projects as soon as possible.

So you weren't willing to put the work in on something DM oriented
because you expected me to do the work for you?  And now you're looking
to side-step the consensus that was built because I didn't contribute
quick enough?  That's pretty messed up.

In the time-scale that is Linux kernel development.. you've really
jumped the gun and undercut my enthusiasm to work through the details.
I'd rather not read into your actions more than I already have here, but
I'm not liking what you've done here.  Kind of left in dismay with how
you decided to go down this path without a followup note to me or others
(that I'm aware of).

But I'm still willing to make blk_interposer work as we all discussed
(in terms of DM).

Mike

> Sergei Shtepa (3):
>   block: blk_interposer - Block Layer Interposer
>   block: blk_interposer - sample
>   block: blk_interposer - sample config
> 
>  block/blk-core.c                        |  32 +++
>  include/linux/blk_types.h               |   6 +-
>  include/linux/genhd.h                   |  12 +-
>  samples/Kconfig                         |   7 +
>  samples/Makefile                        |   1 +
>  samples/blk_interposer/Makefile         |   2 +
>  samples/blk_interposer/blk-interposer.c | 276 ++++++++++++++++++++++++
>  7 files changed, 333 insertions(+), 3 deletions(-)
>  create mode 100644 samples/blk_interposer/Makefile
>  create mode 100644 samples/blk_interposer/blk-interposer.c
> 
> --
> 2.20.1
>
Sergei Shtepa Dec. 10, 2020, 2:58 p.m. UTC | #2
The 12/09/2020 16:51, Mike Snitzer wrote:
> On Wed, Dec 09 2020 at  8:01am -0500,
> Sergei Shtepa <sergei.shtepa@veeam.com> wrote:
> 
> > Hi all.
> > 
> > I try to suggest the Block Layer Interposer (blk_interposer) again.
> > It`s allows to intercept bio requests, remap bio to another devices
> > or add new bios.
> > 
> > Initially, blk_interposer was designed to be compatible with
> > device mapper. Our (my and Hannes) previous attempt to offer
> > blk_interposer integrated with device mapper did not receive
> > any comments from the dm-devel team, and without their help
> > I will not be able to make a full implementation. I hope later
> > they will have time to support blk_interposer in device mapper.
> 
> Excuse me?  I gave you quite a bit of early feedback!  I then went on
> PTO for ~10 days, when I returned last week I had to deal with fixing
> some 5.10 dm/block bio splitting regressions that only got resolved this
> past Saturday in time for 5.10-rc7.

Mike,

I would like to clarify some points that I've made, and also try 
to repair the damage from the misunderstandings that I think have occured.

First of all, I actually meant the feedback on Hannes's patch which was
sent on the 19th of November:
https://lore.kernel.org/linux-block/20201119164924.74401-1-hare@suse.de/

Your feedback from the 18th of November ("[PATCH 4/4] dm_interposer - 
Try to use blk_interpose in dm") is very valuable, but I am not sure that
I am currently capable of implementing the proposed DM changes.
The overall architecture of DM is still not clear to me, and I am studying
it right now.

This new patch (the one that Hannes sent on the 19th of November) is also
compatibile with DM and should not pose any problems - the architecture is
the same. There are some changes that make blk_interposer simpler and better,
plus the sample is added.

> 
> blk_interposer was/is on my short list to review closer (Hannes' version
> that refined yours a bit more).. primarily to see if I could avoid the
> extra clone and endio hooking.

Glad to hear that! In order to avoid the additional copying one can only
change an intercepted bio, which might be dangerous.

> 
> The development window for 5.11 is past, so you pushing this without
> using the approach discussed (in terms of DM) doesn't help your cause.
> 
> > And my blk-snap module requires an architecture change to
> > support blk_interposer.
> > 
> > This time I offer it along with a sample.
> > Despite the fact that blk_interposer is quite simple,
> > there are a few non-obvious points that I would like to clarify.
> > 
> > However, I suggest the blk_interposer now so that people
> > could discuss it and use it in their projects as soon as possible.
> 
> So you weren't willing to put the work in on something DM oriented
> because you expected me to do the work for you?  And now you're looking
> to side-step the consensus that was built because I didn't contribute
> quick enough?  That's pretty messed up.

I just think that no one can implement integration of DM with
blk_interposer better than dm-devel team. I will certainly try my best,
but I am afraid that such efforts will only produce incongruous
DM patches that will be a waste of time (yours and mine).

> 
> In the time-scale that is Linux kernel development.. you've really
> jumped the gun and undercut my enthusiasm to work through the details.
> I'd rather not read into your actions more than I already have here, but
> I'm not liking what you've done here.  Kind of left in dismay with how
> you decided to go down this path without a followup note to me or others
> (that I'm aware of).

I am very sorry that I undercut your enthusiasm, but, as you rightly
pointed out, another development windows is closing, and my product
is still not able to work on newer Linux versions starting from 5.8.
That fact makes me particularly sad and forces me to search for different
means to draw some attention to blk_interposer. I've seen an RHEL 8.4
alpha recently, all looks very cool there but made me even more sad ...

Also I certainly remember a separate email that was sent to you on the
7th of December. Maybe it should have been written in the already
existing thread instead.

> 
> But I'm still willing to make blk_interposer work as we all discussed
> (in terms of DM).

I want it too. However, there is a certain difficulty with usage of DM
for backup copying. For instance, there is no changed block tracking (CBT)
and right now I don't have any clue how it could be implemented
considering the existing architecture. I still hope that sometime
in future I could be offer my blk-snap module which was specifically
created for backup copying purposes.

I apologize for causing all that confusion and mess and making you upset.
I hope that all of the above makes sense to you and you will not think
that I was slacking and tried to offload all the work to your team.

> 
> Mike
> 
> > Sergei Shtepa (3):
> >   block: blk_interposer - Block Layer Interposer
> >   block: blk_interposer - sample
> >   block: blk_interposer - sample config
> > 
> >  block/blk-core.c                        |  32 +++
> >  include/linux/blk_types.h               |   6 +-
> >  include/linux/genhd.h                   |  12 +-
> >  samples/Kconfig                         |   7 +
> >  samples/Makefile                        |   1 +
> >  samples/blk_interposer/Makefile         |   2 +
> >  samples/blk_interposer/blk-interposer.c | 276 ++++++++++++++++++++++++
> >  7 files changed, 333 insertions(+), 3 deletions(-)
> >  create mode 100644 samples/blk_interposer/Makefile
> >  create mode 100644 samples/blk_interposer/blk-interposer.c
> > 
> > --
> > 2.20.1
> > 
>
Mike Snitzer Dec. 10, 2020, 4:32 p.m. UTC | #3
On Thu, Dec 10 2020 at  9:58am -0500,
Sergei Shtepa <sergei.shtepa@veeam.com> wrote:

> The 12/09/2020 16:51, Mike Snitzer wrote:
> > On Wed, Dec 09 2020 at  8:01am -0500,
> > Sergei Shtepa <sergei.shtepa@veeam.com> wrote:
> > 
> > > Hi all.
> > > 
> > > I try to suggest the Block Layer Interposer (blk_interposer) again.
> > > It`s allows to intercept bio requests, remap bio to another devices
> > > or add new bios.
> > > 
> > > Initially, blk_interposer was designed to be compatible with
> > > device mapper. Our (my and Hannes) previous attempt to offer
> > > blk_interposer integrated with device mapper did not receive
> > > any comments from the dm-devel team, and without their help
> > > I will not be able to make a full implementation. I hope later
> > > they will have time to support blk_interposer in device mapper.
> > 
> > Excuse me?  I gave you quite a bit of early feedback!  I then went on
> > PTO for ~10 days, when I returned last week I had to deal with fixing
> > some 5.10 dm/block bio splitting regressions that only got resolved this
> > past Saturday in time for 5.10-rc7.
> 
> Mike,
> 
> I would like to clarify some points that I've made, and also try 
> to repair the damage from the misunderstandings that I think have occured.
> 
> First of all, I actually meant the feedback on Hannes's patch which was
> sent on the 19th of November:
> https://lore.kernel.org/linux-block/20201119164924.74401-1-hare@suse.de/
> 
> Your feedback from the 18th of November ("[PATCH 4/4] dm_interposer - 
> Try to use blk_interpose in dm") is very valuable, but I am not sure that
> I am currently capable of implementing the proposed DM changes.
> The overall architecture of DM is still not clear to me, and I am studying
> it right now.
> 
> This new patch (the one that Hannes sent on the 19th of November) is also
> compatibile with DM and should not pose any problems - the architecture is
> the same. There are some changes that make blk_interposer simpler and better,
> plus the sample is added.
> 
> > 
> > blk_interposer was/is on my short list to review closer (Hannes' version
> > that refined yours a bit more).. primarily to see if I could avoid the
> > extra clone and endio hooking.
> 
> Glad to hear that! In order to avoid the additional copying one can only
> change an intercepted bio, which might be dangerous.
> 
> > 
> > The development window for 5.11 is past, so you pushing this without
> > using the approach discussed (in terms of DM) doesn't help your cause.
> > 
> > > And my blk-snap module requires an architecture change to
> > > support blk_interposer.
> > > 
> > > This time I offer it along with a sample.
> > > Despite the fact that blk_interposer is quite simple,
> > > there are a few non-obvious points that I would like to clarify.
> > > 
> > > However, I suggest the blk_interposer now so that people
> > > could discuss it and use it in their projects as soon as possible.
> > 
> > So you weren't willing to put the work in on something DM oriented
> > because you expected me to do the work for you?  And now you're looking
> > to side-step the consensus that was built because I didn't contribute
> > quick enough?  That's pretty messed up.
> 
> I just think that no one can implement integration of DM with
> blk_interposer better than dm-devel team. I will certainly try my best,
> but I am afraid that such efforts will only produce incongruous
> DM patches that will be a waste of time (yours and mine).
> 
> > 
> > In the time-scale that is Linux kernel development.. you've really
> > jumped the gun and undercut my enthusiasm to work through the details.
> > I'd rather not read into your actions more than I already have here, but
> > I'm not liking what you've done here.  Kind of left in dismay with how
> > you decided to go down this path without a followup note to me or others
> > (that I'm aware of).
> 
> I am very sorry that I undercut your enthusiasm, but, as you rightly
> pointed out, another development windows is closing, and my product
> is still not able to work on newer Linux versions starting from 5.8.
> That fact makes me particularly sad and forces me to search for different
> means to draw some attention to blk_interposer. I've seen an RHEL 8.4
> alpha recently, all looks very cool there but made me even more sad ...

Made you more sad because you don't have a working solution for upstream
or RHEL 8.4?

I may have missed it in your past emails but how were you able to
provide blk-snap support for kernels prior to 5.8?

> Also I certainly remember a separate email that was sent to you on the
> 7th of December. Maybe it should have been written in the already
> existing thread instead.

I don't see any mail from you on Dec 7... please bounce it to me if it
was in private, or please point to the relevant list archive.

> > 
> > But I'm still willing to make blk_interposer work as we all discussed
> > (in terms of DM).
> 
> I want it too. However, there is a certain difficulty with usage of DM
> for backup copying. For instance, there is no changed block tracking (CBT)
> and right now I don't have any clue how it could be implemented
> considering the existing architecture. I still hope that sometime
> in future I could be offer my blk-snap module which was specifically
> created for backup copying purposes.
> 
> I apologize for causing all that confusion and mess and making you upset.
> I hope that all of the above makes sense to you and you will not think
> that I was slacking and tried to offload all the work to your team.

My primary concern is that blk_interposer be correct from the start.  To
validate its correctness it needs to be fully implemented and vetted in
terms on upstream Linux kernel code.  DM can easily serve as the primary
upstream consumer until if/when your blk-snap module is proposed for
upstream inclusion.

But short of having an actual upstream consumer of blk_interposer (not
just samples/ code) it cannot go upstream.  Otherwise there are too many
risks of misuse and problems in the longrun.  That and it'd be baggage
block core would need to carry for no upstream Linux benefit.

Mike
Mike Snitzer Dec. 11, 2020, 4:30 p.m. UTC | #4
On Thu, Dec 10 2020 at 11:32am -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Thu, Dec 10 2020 at  9:58am -0500,
> Sergei Shtepa <sergei.shtepa@veeam.com> wrote:
> 
> > The 12/09/2020 16:51, Mike Snitzer wrote:
> > > On Wed, Dec 09 2020 at  8:01am -0500,
> > > Sergei Shtepa <sergei.shtepa@veeam.com> wrote:
> > > 
> > > > Hi all.
> > > > 
> > > > I try to suggest the Block Layer Interposer (blk_interposer) again.
> > > > It`s allows to intercept bio requests, remap bio to another devices
> > > > or add new bios.
> > > > 
> > > > Initially, blk_interposer was designed to be compatible with
> > > > device mapper. Our (my and Hannes) previous attempt to offer
> > > > blk_interposer integrated with device mapper did not receive
> > > > any comments from the dm-devel team, and without their help
> > > > I will not be able to make a full implementation. I hope later
> > > > they will have time to support blk_interposer in device mapper.
> > > 
> > > Excuse me?  I gave you quite a bit of early feedback!  I then went on
> > > PTO for ~10 days, when I returned last week I had to deal with fixing
> > > some 5.10 dm/block bio splitting regressions that only got resolved this
> > > past Saturday in time for 5.10-rc7.
> > 
> > Mike,
> > 
> > I would like to clarify some points that I've made, and also try 
> > to repair the damage from the misunderstandings that I think have occured.
> > 
> > First of all, I actually meant the feedback on Hannes's patch which was
> > sent on the 19th of November:
> > https://lore.kernel.org/linux-block/20201119164924.74401-1-hare@suse.de/
> > 
> > Your feedback from the 18th of November ("[PATCH 4/4] dm_interposer - 
> > Try to use blk_interpose in dm") is very valuable, but I am not sure that
> > I am currently capable of implementing the proposed DM changes.
> > The overall architecture of DM is still not clear to me, and I am studying
> > it right now.
> > 
> > This new patch (the one that Hannes sent on the 19th of November) is also
> > compatibile with DM and should not pose any problems - the architecture is
> > the same. There are some changes that make blk_interposer simpler and better,
> > plus the sample is added.
> > 
> > > 
> > > blk_interposer was/is on my short list to review closer (Hannes' version
> > > that refined yours a bit more).. primarily to see if I could avoid the
> > > extra clone and endio hooking.
> > 
> > Glad to hear that! In order to avoid the additional copying one can only
> > change an intercepted bio, which might be dangerous.
> > 
> > > 
> > > The development window for 5.11 is past, so you pushing this without
> > > using the approach discussed (in terms of DM) doesn't help your cause.
> > > 
> > > > And my blk-snap module requires an architecture change to
> > > > support blk_interposer.
> > > > 
> > > > This time I offer it along with a sample.
> > > > Despite the fact that blk_interposer is quite simple,
> > > > there are a few non-obvious points that I would like to clarify.
> > > > 
> > > > However, I suggest the blk_interposer now so that people
> > > > could discuss it and use it in their projects as soon as possible.
> > > 
> > > So you weren't willing to put the work in on something DM oriented
> > > because you expected me to do the work for you?  And now you're looking
> > > to side-step the consensus that was built because I didn't contribute
> > > quick enough?  That's pretty messed up.
> > 
> > I just think that no one can implement integration of DM with
> > blk_interposer better than dm-devel team. I will certainly try my best,
> > but I am afraid that such efforts will only produce incongruous
> > DM patches that will be a waste of time (yours and mine).
> > 
> > > 
> > > In the time-scale that is Linux kernel development.. you've really
> > > jumped the gun and undercut my enthusiasm to work through the details.
> > > I'd rather not read into your actions more than I already have here, but
> > > I'm not liking what you've done here.  Kind of left in dismay with how
> > > you decided to go down this path without a followup note to me or others
> > > (that I'm aware of).
> > 
> > I am very sorry that I undercut your enthusiasm, but, as you rightly
> > pointed out, another development windows is closing, and my product
> > is still not able to work on newer Linux versions starting from 5.8.
> > That fact makes me particularly sad and forces me to search for different
> > means to draw some attention to blk_interposer. I've seen an RHEL 8.4
> > alpha recently, all looks very cool there but made me even more sad ...
> 
> Made you more sad because you don't have a working solution for upstream
> or RHEL 8.4?
> 
> I may have missed it in your past emails but how were you able to
> provide blk-snap support for kernels prior to 5.8?

I now clearly understand that the 5.8 block changes to do away with
->make_request_fn in favor of a more optimized ->submit_bio (that isn't
applicable for all devices) is why you're pursuing a "fix" so urgently.

> > > But I'm still willing to make blk_interposer work as we all discussed
> > > (in terms of DM).
> > 
> > I want it too. However, there is a certain difficulty with usage of DM
> > for backup copying. For instance, there is no changed block tracking (CBT)
> > and right now I don't have any clue how it could be implemented
> > considering the existing architecture. I still hope that sometime
> > in future I could be offer my blk-snap module which was specifically
> > created for backup copying purposes.
> > 
> > I apologize for causing all that confusion and mess and making you upset.
> > I hope that all of the above makes sense to you and you will not think
> > that I was slacking and tried to offload all the work to your team.
> 
> My primary concern is that blk_interposer be correct from the start.  To
> validate its correctness it needs to be fully implemented and vetted in
> terms on upstream Linux kernel code.  DM can easily serve as the primary
> upstream consumer until if/when your blk-snap module is proposed for
> upstream inclusion.
> 
> But short of having an actual upstream consumer of blk_interposer (not
> just samples/ code) it cannot go upstream.  Otherwise there are too many
> risks of misuse and problems in the longrun.  That and it'd be baggage
> block core would need to carry for no upstream Linux benefit.

As I shared in private: I have some urgent Red Hat business critical
work I need to do and unfortunately cannot put my near-term focus to
implementing a fully baked blk_interposer for DM.  But I can come back
to it (sadly unlikely to do so until the new year though).

While I still think there needs to be a proper _upstream_ consumer of
blk_interposer as a condition of it going in.. I'll let others make the
call.

As such, I'll defer to Jens, Christoph and others on whether your
minimalist blk_interposer hook is acceptable in the near-term.

Mike
Jens Axboe Dec. 11, 2020, 4:33 p.m. UTC | #5
On 12/11/20 9:30 AM, Mike Snitzer wrote:
> While I still think there needs to be a proper _upstream_ consumer of
> blk_interposer as a condition of it going in.. I'll let others make the
> call.

That's an unequivocal rule.

> As such, I'll defer to Jens, Christoph and others on whether your
> minimalist blk_interposer hook is acceptable in the near-term.

I don't think so, we don't do short term bandaids just to plan on
ripping that out when the real functionality is there. IMHO, the dm
approach is the way to go - it provides exactly the functionality that
is needed in an appropriate way, instead of hacking some "interposer"
into the core block layer.
Hannes Reinecke Dec. 11, 2020, 4:56 p.m. UTC | #6
On 12/11/20 5:33 PM, Jens Axboe wrote:
> On 12/11/20 9:30 AM, Mike Snitzer wrote:
>> While I still think there needs to be a proper _upstream_ consumer of
>> blk_interposer as a condition of it going in.. I'll let others make the
>> call.
> 
> That's an unequivocal rule.
> 
>> As such, I'll defer to Jens, Christoph and others on whether your
>> minimalist blk_interposer hook is acceptable in the near-term.
> 
> I don't think so, we don't do short term bandaids just to plan on
> ripping that out when the real functionality is there. IMHO, the dm
> approach is the way to go - it provides exactly the functionality that
> is needed in an appropriate way, instead of hacking some "interposer"
> into the core block layer.
> 
Which is my plan, too.

I'll be working with the Veeam folks to present a joint patchset 
(including the DM bits) for the next round.

Cheers,

Hannes
Jens Axboe Dec. 11, 2020, 5:04 p.m. UTC | #7
On 12/11/20 9:56 AM, Hannes Reinecke wrote:
> On 12/11/20 5:33 PM, Jens Axboe wrote:
>> On 12/11/20 9:30 AM, Mike Snitzer wrote:
>>> While I still think there needs to be a proper _upstream_ consumer of
>>> blk_interposer as a condition of it going in.. I'll let others make the
>>> call.
>>
>> That's an unequivocal rule.
>>
>>> As such, I'll defer to Jens, Christoph and others on whether your
>>> minimalist blk_interposer hook is acceptable in the near-term.
>>
>> I don't think so, we don't do short term bandaids just to plan on
>> ripping that out when the real functionality is there. IMHO, the dm
>> approach is the way to go - it provides exactly the functionality that
>> is needed in an appropriate way, instead of hacking some "interposer"
>> into the core block layer.
>>
> Which is my plan, too.
> 
> I'll be working with the Veeam folks to present a joint patchset 
> (including the DM bits) for the next round.

Just to be clear, core block additions for something that dm will
consume is obviously fine. Adding this as block layer functionality than
then exposes an application API for setting it up, tearing down, etc -
that is definitely NOT
Hannes Reinecke Dec. 11, 2020, 6:03 p.m. UTC | #8
On 12/11/20 6:04 PM, Jens Axboe wrote:
> On 12/11/20 9:56 AM, Hannes Reinecke wrote:
>> On 12/11/20 5:33 PM, Jens Axboe wrote:
>>> On 12/11/20 9:30 AM, Mike Snitzer wrote:
>>>> While I still think there needs to be a proper _upstream_ consumer of
>>>> blk_interposer as a condition of it going in.. I'll let others make the
>>>> call.
>>>
>>> That's an unequivocal rule.
>>>
>>>> As such, I'll defer to Jens, Christoph and others on whether your
>>>> minimalist blk_interposer hook is acceptable in the near-term.
>>>
>>> I don't think so, we don't do short term bandaids just to plan on
>>> ripping that out when the real functionality is there. IMHO, the dm
>>> approach is the way to go - it provides exactly the functionality that
>>> is needed in an appropriate way, instead of hacking some "interposer"
>>> into the core block layer.
>>>
>> Which is my plan, too.
>>
>> I'll be working with the Veeam folks to present a joint patchset
>> (including the DM bits) for the next round.
> 
> Just to be clear, core block additions for something that dm will
> consume is obviously fine. Adding this as block layer functionality than
> then exposes an application API for setting it up, tearing down, etc -
> that is definitely NOT
> 
That was never my intention.
Any consumers of this thing would need to be in-kernel.
If that was your concern.

Cheers,

Hannes
Jens Axboe Dec. 12, 2020, 6:14 p.m. UTC | #9
On 12/11/20 11:03 AM, Hannes Reinecke wrote:
> On 12/11/20 6:04 PM, Jens Axboe wrote:
>> On 12/11/20 9:56 AM, Hannes Reinecke wrote:
>>> On 12/11/20 5:33 PM, Jens Axboe wrote:
>>>> On 12/11/20 9:30 AM, Mike Snitzer wrote:
>>>>> While I still think there needs to be a proper _upstream_ consumer of
>>>>> blk_interposer as a condition of it going in.. I'll let others make the
>>>>> call.
>>>>
>>>> That's an unequivocal rule.
>>>>
>>>>> As such, I'll defer to Jens, Christoph and others on whether your
>>>>> minimalist blk_interposer hook is acceptable in the near-term.
>>>>
>>>> I don't think so, we don't do short term bandaids just to plan on
>>>> ripping that out when the real functionality is there. IMHO, the dm
>>>> approach is the way to go - it provides exactly the functionality that
>>>> is needed in an appropriate way, instead of hacking some "interposer"
>>>> into the core block layer.
>>>>
>>> Which is my plan, too.
>>>
>>> I'll be working with the Veeam folks to present a joint patchset
>>> (including the DM bits) for the next round.
>>
>> Just to be clear, core block additions for something that dm will
>> consume is obviously fine. Adding this as block layer functionality than
>> then exposes an application API for setting it up, tearing down, etc -
>> that is definitely NOT
>>
> That was never my intention.
> Any consumers of this thing would need to be in-kernel.
> If that was your concern.

Yep, that is/was indeed my concern!
Bob Liu Dec. 15, 2020, 6:51 a.m. UTC | #10
Hi Folks,

On 12/12/20 12:56 AM, Hannes Reinecke wrote:
> On 12/11/20 5:33 PM, Jens Axboe wrote:
>> On 12/11/20 9:30 AM, Mike Snitzer wrote:
>>> While I still think there needs to be a proper _upstream_ consumer of
>>> blk_interposer as a condition of it going in.. I'll let others make the
>>> call.
>>
>> That's an unequivocal rule.
>>
>>> As such, I'll defer to Jens, Christoph and others on whether your
>>> minimalist blk_interposer hook is acceptable in the near-term.
>>
>> I don't think so, we don't do short term bandaids just to plan on
>> ripping that out when the real functionality is there. IMHO, the dm
>> approach is the way to go - it provides exactly the functionality that
>> is needed in an appropriate way, instead of hacking some "interposer"
>> into the core block layer.
>>
> Which is my plan, too.
> 
> I'll be working with the Veeam folks to present a joint patchset (including the DM bits) for the next round.
> 

Besides the dm approach, do you think Veeam's original requirement is a good
use case of "block/bpf: add eBPF based block layer IO filtering"?
https://lwn.net/ml/bpf/20200812163305.545447-1-leah.rumancik@gmail.com/

Thanks,
Bob
Hannes Reinecke Dec. 15, 2020, 7:41 a.m. UTC | #11
On 12/15/20 7:51 AM, Bob Liu wrote:
> Hi Folks,
> 
> On 12/12/20 12:56 AM, Hannes Reinecke wrote:
>> On 12/11/20 5:33 PM, Jens Axboe wrote:
>>> On 12/11/20 9:30 AM, Mike Snitzer wrote:
>>>> While I still think there needs to be a proper _upstream_ consumer of
>>>> blk_interposer as a condition of it going in.. I'll let others make the
>>>> call.
>>>
>>> That's an unequivocal rule.
>>>
>>>> As such, I'll defer to Jens, Christoph and others on whether your
>>>> minimalist blk_interposer hook is acceptable in the near-term.
>>>
>>> I don't think so, we don't do short term bandaids just to plan on
>>> ripping that out when the real functionality is there. IMHO, the dm
>>> approach is the way to go - it provides exactly the functionality that
>>> is needed in an appropriate way, instead of hacking some "interposer"
>>> into the core block layer.
>>>
>> Which is my plan, too.
>>
>> I'll be working with the Veeam folks to present a joint patchset (including the DM bits) for the next round.
>>
> 
> Besides the dm approach, do you think Veeam's original requirement is a good
> use case of "block/bpf: add eBPF based block layer IO filtering"?
> https://lwn.net/ml/bpf/20200812163305.545447-1-leah.rumancik@gmail.com/
> 
That would actually a really cool use-case.
You could also consider a XDP-like functionality for eBPF, to move 
individual requests from one queue to the other; DM on steroids :-)

Should I try to include that patchset?

Cheers,

Hannes
Sergei Shtepa Dec. 15, 2020, 3:57 p.m. UTC | #12
The 12/15/2020 09:51, Bob Liu wrote:
> Hi Folks,
> 
> On 12/12/20 12:56 AM, Hannes Reinecke wrote:
> > On 12/11/20 5:33 PM, Jens Axboe wrote:
> >> On 12/11/20 9:30 AM, Mike Snitzer wrote:
> >>> While I still think there needs to be a proper _upstream_ consumer of
> >>> blk_interposer as a condition of it going in.. I'll let others make the
> >>> call.
> >>
> >> That's an unequivocal rule.
> >>
> >>> As such, I'll defer to Jens, Christoph and others on whether your
> >>> minimalist blk_interposer hook is acceptable in the near-term.
> >>
> >> I don't think so, we don't do short term bandaids just to plan on
> >> ripping that out when the real functionality is there. IMHO, the dm
> >> approach is the way to go - it provides exactly the functionality that
> >> is needed in an appropriate way, instead of hacking some "interposer"
> >> into the core block layer.
> >>
> > Which is my plan, too.
> > 
> > I'll be working with the Veeam folks to present a joint patchset (including the DM bits) for the next round.
> > 
> 
> Besides the dm approach, do you think Veeam's original requirement is a good
> use case of "block/bpf: add eBPF based block layer IO filtering"?
> https://lwn.net/ml/bpf/20200812163305.545447-1-leah.rumancik@gmail.com/
> 
> Thanks,
> Bob

Hi Bob.
Thank you for your message.

I looked at your patch - it's interesting, but I have a few comments.

1.
#ifdef CONFIG_BPF_IO_FILTER
	struct bpf_prog_array __rcu *progs;
	struct mutex io_filter_lock;
#endif
For DM and blk-snap to work, it is necessary that there is always
the possibility of interception.

2.
We could get rid of the io_filter_lock lock if we do attach/detach while
the queue is stopped. And __rcu for *progs, can not use either.

3.
int io_filter_bpf_run(struct bio *bio)
{
	struct bpf_io_request io_req = {
		.sector_start = bio->bi_iter.bi_sector,
		.sector_cnt = bio_sectors(bio),
		.opf = bio->bi_opf,
	};
	return BPF_PROG_RUN_ARRAY_CHECK(bio->bi_disk->progs, &io_req, BPF_PROG_RUN);
}
Allows to get little information. It will not allow to work with the bios`s data.
blk_interposer allows to get full access to bio.
For use in the DM, we must also be able to add new bio's.

Summary:
For device-mapper purposes, bpf_io_filter is not suitable.
bpf_io_filter in this form is probably convenient to use for monitoring and
studying the block layer.
For the security task, I would suggest making a separate module and using
blk_interposer to intercept bio requests. This will give more flexible
functionality and better performance.

--
Sergei Shtepa
Veeam Software developer.