Message ID | 20180830185352.3369-8-logang@deltatee.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Copy Offload in NVMe Fabrics with P2P PCI Memory | expand |
On 8/30/18 12:53 PM, Logan Gunthorpe wrote: > QUEUE_FLAG_PCI_P2P is introduced meaning a driver's request queue > supports targeting P2P memory. > > When a request is submitted we check if PCI P2PDMA memory is assigned > to the first page in the bio. If it is, we ensure the queue it's > submitted to supports it, and enforce REQ_NOMERGE. I think this belongs in the caller - both the validity check, and passing in NOMERGE for this type of request. I don't want to impose this overhead on everything, for a pretty niche case.
On 30/08/18 01:11 PM, Jens Axboe wrote: > On 8/30/18 12:53 PM, Logan Gunthorpe wrote: >> QUEUE_FLAG_PCI_P2P is introduced meaning a driver's request queue >> supports targeting P2P memory. >> >> When a request is submitted we check if PCI P2PDMA memory is assigned >> to the first page in the bio. If it is, we ensure the queue it's >> submitted to supports it, and enforce REQ_NOMERGE. > > I think this belongs in the caller - both the validity check, and > passing in NOMERGE for this type of request. I don't want to impose > this overhead on everything, for a pretty niche case. Well, the point was to prevent driver writers from doing the wrong thing. The WARN_ON would be a bit pointless in the driver if we rely on the driver to either do the right thing or add the WARN_ON themselves. If I'm going to change anything I'd drop the warning entirely and move the NO_MERGE back into the caller... Note: the check will be compiled out if the kernel does not support PCI P2P. Logan
On 8/30/18 1:17 PM, Logan Gunthorpe wrote: > > > On 30/08/18 01:11 PM, Jens Axboe wrote: >> On 8/30/18 12:53 PM, Logan Gunthorpe wrote: >>> QUEUE_FLAG_PCI_P2P is introduced meaning a driver's request queue >>> supports targeting P2P memory. >>> >>> When a request is submitted we check if PCI P2PDMA memory is assigned >>> to the first page in the bio. If it is, we ensure the queue it's >>> submitted to supports it, and enforce REQ_NOMERGE. >> >> I think this belongs in the caller - both the validity check, and >> passing in NOMERGE for this type of request. I don't want to impose >> this overhead on everything, for a pretty niche case. > > Well, the point was to prevent driver writers from doing the wrong > thing. The WARN_ON would be a bit pointless in the driver if we rely on > the driver to either do the right thing or add the WARN_ON themselves. > > If I'm going to change anything I'd drop the warning entirely and move > the NO_MERGE back into the caller... Of course, if you move it into the caller, the warning makes no sense. > Note: the check will be compiled out if the kernel does not support PCI P2P. Sure, but then distros tend to enable everything...
On Thu, Aug 30, 2018 at 01:11:18PM -0600, Jens Axboe wrote: > I think this belongs in the caller - both the validity check, and > passing in NOMERGE for this type of request. I don't want to impose > this overhead on everything, for a pretty niche case. It is just a single branch, which will be predicted as not taken for non-P2P users. The benefit is that we get proper error checking by doing it in the block code.
On 01/09/18 02:28 AM, Christoph Hellwig wrote: > On Thu, Aug 30, 2018 at 01:11:18PM -0600, Jens Axboe wrote: >> I think this belongs in the caller - both the validity check, and >> passing in NOMERGE for this type of request. I don't want to impose >> this overhead on everything, for a pretty niche case. > > It is just a single branch, which will be predicted as not taken > for non-P2P users. The benefit is that we get proper error checking > by doing it in the block code. I personally agree with Christoph. But if there's consensus in the other direction or this is a real blocker moving this forward, I can remove it for the next version. Logan
On 9/3/18 4:26 PM, Logan Gunthorpe wrote: > > > On 01/09/18 02:28 AM, Christoph Hellwig wrote: >> On Thu, Aug 30, 2018 at 01:11:18PM -0600, Jens Axboe wrote: >>> I think this belongs in the caller - both the validity check, and >>> passing in NOMERGE for this type of request. I don't want to impose >>> this overhead on everything, for a pretty niche case. >> >> It is just a single branch, which will be predicted as not taken >> for non-P2P users. The benefit is that we get proper error checking >> by doing it in the block code. > > I personally agree with Christoph. But if there's consensus in the other > direction or this is a real blocker moving this forward, I can remove it > for the next version. It's a simple branch because the check isn't exhaustive. It just checks the first page. At that point you may as well just require the caller to flag the bio/rq as being P2P, and then do a check for P2P compatibility with the queue.
On 05/09/18 01:26 PM, Jens Axboe wrote: > On 9/3/18 4:26 PM, Logan Gunthorpe wrote: >> I personally agree with Christoph. But if there's consensus in the other >> direction or this is a real blocker moving this forward, I can remove it >> for the next version. > > It's a simple branch because the check isn't exhaustive. It just checks > the first page. At that point you may as well just require the caller to > flag the bio/rq as being P2P, and then do a check for P2P compatibility > with the queue. Hmm, we had something like that in v4[1] but it just seemed redundant to create a flag when the information was already in the bio and kind of ugly for the caller to check for, then set, the flag. I'm not _that_ averse to going back to that though... Logan [1] https://lore.kernel.org/lkml/20180423233046.21476-8-logang@deltatee.com/T/#u
On 9/5/18 1:33 PM, Logan Gunthorpe wrote: > > > On 05/09/18 01:26 PM, Jens Axboe wrote: >> On 9/3/18 4:26 PM, Logan Gunthorpe wrote: >>> I personally agree with Christoph. But if there's consensus in the other >>> direction or this is a real blocker moving this forward, I can remove it >>> for the next version. >> >> It's a simple branch because the check isn't exhaustive. It just checks >> the first page. At that point you may as well just require the caller to >> flag the bio/rq as being P2P, and then do a check for P2P compatibility >> with the queue. > > Hmm, we had something like that in v4[1] but it just seemed redundant to > create a flag when the information was already in the bio and kind of > ugly for the caller to check for, then set, the flag. I'm not _that_ > averse to going back to that though... The point is that the caller doesn't necessarily know where the bio will end up, hence the caller can't fully check if the whole stack supports P2P. What happens if a P2P request ends up with a driver that doesn't support it?
On 05/09/18 01:45 PM, Jens Axboe wrote: > The point is that the caller doesn't necessarily know where the bio > will end up, hence the caller can't fully check if the whole stack > supports P2P. > > What happens if a P2P request ends up with a driver that doesn't > support it? Yes, that's the whole point this check. Although we expect the caller to do other checks before submitting a P2P request to a queue, so if a driver does submit a P2P request to an unsupported queue, it is definitely a problem in the driver (which is why we want to WARN). Queues that support P2P (only PCI NVMe at this time, see patch 10) must set QUEUE_FLAG_PCI_P2PDMA to indicate it. The check we are adding in blk-core is meant to ensure any broken drivers that submit requests with P2P memory do not get sent to a queue that doesn't indicate support. On top of that, the code in NVMe target ensures that all namespaces on a port are backed by queues that support P2P and, if not, it never allocates any P2P SGLs. Logan
On 9/5/18 1:56 PM, Christoph Hellwig wrote: > On Wed, Sep 05, 2018 at 01:45:04PM -0600, Jens Axboe wrote: >> The point is that the caller doesn't necessarily know where the bio >> will end up, hence the caller can't fully check if the whole stack >> supports P2P. > > The caller must necessarily know where the bio will end up, as for P2P > support we need to query if the bio target is P2P capable vs the > source of the P2P memory. Then what's the point of having the check at all?
On Wed, Sep 05, 2018 at 01:45:04PM -0600, Jens Axboe wrote: > The point is that the caller doesn't necessarily know where the bio > will end up, hence the caller can't fully check if the whole stack > supports P2P. The caller must necessarily know where the bio will end up, as for P2P support we need to query if the bio target is P2P capable vs the source of the P2P memory.
On 05/09/18 02:11 PM, Christoph Hellwig wrote: > On Wed, Sep 05, 2018 at 01:54:31PM -0600, Jens Axboe wrote: >> On 9/5/18 1:56 PM, Christoph Hellwig wrote: >>> On Wed, Sep 05, 2018 at 01:45:04PM -0600, Jens Axboe wrote: >>>> The point is that the caller doesn't necessarily know where the bio >>>> will end up, hence the caller can't fully check if the whole stack >>>> supports P2P. >>> >>> The caller must necessarily know where the bio will end up, as for P2P >>> support we need to query if the bio target is P2P capable vs the >>> source of the P2P memory. >> >> Then what's the point of having the check at all? > > Just an additional little safe guard. If you think it isn't worth > it I guess we can just drop it for now. Yes, the point is to prevent driver writers from doing the wrong thing by not doing the necessary checks before submitting to the queue. Logan
On Wed, Sep 05, 2018 at 01:54:31PM -0600, Jens Axboe wrote: > On 9/5/18 1:56 PM, Christoph Hellwig wrote: > > On Wed, Sep 05, 2018 at 01:45:04PM -0600, Jens Axboe wrote: > >> The point is that the caller doesn't necessarily know where the bio > >> will end up, hence the caller can't fully check if the whole stack > >> supports P2P. > > > > The caller must necessarily know where the bio will end up, as for P2P > > support we need to query if the bio target is P2P capable vs the > > source of the P2P memory. > > Then what's the point of having the check at all? Just an additional little safe guard. If you think it isn't worth it I guess we can just drop it for now.
On 9/5/18 2:09 PM, Logan Gunthorpe wrote: > > > On 05/09/18 02:11 PM, Christoph Hellwig wrote: >> On Wed, Sep 05, 2018 at 01:54:31PM -0600, Jens Axboe wrote: >>> On 9/5/18 1:56 PM, Christoph Hellwig wrote: >>>> On Wed, Sep 05, 2018 at 01:45:04PM -0600, Jens Axboe wrote: >>>>> The point is that the caller doesn't necessarily know where the bio >>>>> will end up, hence the caller can't fully check if the whole stack >>>>> supports P2P. >>>> >>>> The caller must necessarily know where the bio will end up, as for P2P >>>> support we need to query if the bio target is P2P capable vs the >>>> source of the P2P memory. >>> >>> Then what's the point of having the check at all? >> >> Just an additional little safe guard. If you think it isn't worth >> it I guess we can just drop it for now. > > Yes, the point is to prevent driver writers from doing the wrong thing > by not doing the necessary checks before submitting to the queue. But if the caller must absolutely know where the bio will end up, then it seems super redundant. So I'd vote for killing this check, it buys us absolutely nothing and isn't even exhaustive in its current form.
On 05/09/18 02:14 PM, Jens Axboe wrote: > But if the caller must absolutely know where the bio will end up, then > it seems super redundant. So I'd vote for killing this check, it buys > us absolutely nothing and isn't even exhaustive in its current form. Ok, I'll remove it for v6. Logan
On 9/5/18 2:18 PM, Logan Gunthorpe wrote: > > > On 05/09/18 02:14 PM, Jens Axboe wrote: >> But if the caller must absolutely know where the bio will end up, then >> it seems super redundant. So I'd vote for killing this check, it buys >> us absolutely nothing and isn't even exhaustive in its current form. > > > Ok, I'll remove it for v6. Since the drivers needs to know it's doing it right, it might not hurt to add a sanity check helper for that. Just have the driver call it, and don't add it in the normal IO submission path.
On 05/09/18 02:19 PM, Jens Axboe wrote: > On 9/5/18 2:18 PM, Logan Gunthorpe wrote: >> >> >> On 05/09/18 02:14 PM, Jens Axboe wrote: >>> But if the caller must absolutely know where the bio will end up, then >>> it seems super redundant. So I'd vote for killing this check, it buys >>> us absolutely nothing and isn't even exhaustive in its current form. >> >> >> Ok, I'll remove it for v6. > > Since the drivers needs to know it's doing it right, it might not > hurt to add a sanity check helper for that. Just have the driver > call it, and don't add it in the normal IO submission path. I'm not sure I really see the value in that. It's the same principle in asking the driver to do the WARN: if the developer knew enough to use the special helper, they probably knew well enough to do the rest correctly. I guess one other thing to point out is that, on x86, if a driver submits P2P pages to a PCI device that doesn't have kernel support, everything will likely just work. Even though the driver isn't doing any of the work correctly and the requests are not being mapped with pci_p2pdma_map() functions. Such code on other arches would likely break. So developers may be lulled into thinking they're doing the correct thing when in fact they are not and the WARN in the common code would prevent that. Logan
On 9/5/18 2:32 PM, Logan Gunthorpe wrote: > > > On 05/09/18 02:19 PM, Jens Axboe wrote: >> On 9/5/18 2:18 PM, Logan Gunthorpe wrote: >>> >>> >>> On 05/09/18 02:14 PM, Jens Axboe wrote: >>>> But if the caller must absolutely know where the bio will end up, then >>>> it seems super redundant. So I'd vote for killing this check, it buys >>>> us absolutely nothing and isn't even exhaustive in its current form. >>> >>> >>> Ok, I'll remove it for v6. >> >> Since the drivers needs to know it's doing it right, it might not >> hurt to add a sanity check helper for that. Just have the driver >> call it, and don't add it in the normal IO submission path. > > I'm not sure I really see the value in that. It's the same principle in > asking the driver to do the WARN: if the developer knew enough to use > the special helper, they probably knew well enough to do the rest correctly. I don't agree with that at all. It's a "is my request valid" helper, it's not some obscure and rarely used functionality. You're making up this API right now, if you really want it done for every IO, make it part of the p2p submission process. You could even hide it behind a debug thing, if you like. > I guess one other thing to point out is that, on x86, if a driver > submits P2P pages to a PCI device that doesn't have kernel support, > everything will likely just work. Even though the driver isn't doing any > of the work correctly and the requests are not being mapped with > pci_p2pdma_map() functions. Such code on other arches would likely > break. So developers may be lulled into thinking they're doing the > correct thing when in fact they are not and the WARN in the common code > would prevent that. If you're adamant about having it in common code, put it in your common submission code. Most folks aren't going to care about P2P, let the ones that do have the checks.
On 05/09/18 02:36 PM, Jens Axboe wrote: > On 9/5/18 2:32 PM, Logan Gunthorpe wrote: >> >> >> On 05/09/18 02:19 PM, Jens Axboe wrote: >>> On 9/5/18 2:18 PM, Logan Gunthorpe wrote: >>>> >>>> >>>> On 05/09/18 02:14 PM, Jens Axboe wrote: >>>>> But if the caller must absolutely know where the bio will end up, then >>>>> it seems super redundant. So I'd vote for killing this check, it buys >>>>> us absolutely nothing and isn't even exhaustive in its current form. >>>> >>>> >>>> Ok, I'll remove it for v6. >>> >>> Since the drivers needs to know it's doing it right, it might not >>> hurt to add a sanity check helper for that. Just have the driver >>> call it, and don't add it in the normal IO submission path. >> >> I'm not sure I really see the value in that. It's the same principle in >> asking the driver to do the WARN: if the developer knew enough to use >> the special helper, they probably knew well enough to do the rest correctly. > > I don't agree with that at all. It's a "is my request valid" helper, > it's not some obscure and rarely used functionality. You're making up > this API right now, if you really want it done for every IO, make it > part of the p2p submission process. You could even hide it behind a > debug thing, if you like. There is no special p2p submission process. In the nvme-of case we are using the existing process and with the code in blk-core it didn't change it's process at all. Creating a helper will create one and I can look at making a pci_p2pdma_submit_bio() for v6; but if the developer screws up and still calls the regular submit_bio() things will only be very subtly broken and that won't be obvious. Logan
On Wed, Sep 05, 2018 at 03:03:18PM -0600, Logan Gunthorpe wrote: > There is no special p2p submission process. In the nvme-of case we are > using the existing process and with the code in blk-core it didn't > change it's process at all. Creating a helper will create one and I can > look at making a pci_p2pdma_submit_bio() for v6; but if the developer > screws up and still calls the regular submit_bio() things will only be > very subtly broken and that won't be obvious. I thought about that when reviewing the previous series, and even started hacking it up. In the end I decided against it for the above reason - it just adds code, but doesn't actually help with anything as it is trivial to forget, and not using it will in fact just work.
On 9/5/18 3:03 PM, Logan Gunthorpe wrote: > > > On 05/09/18 02:36 PM, Jens Axboe wrote: >> On 9/5/18 2:32 PM, Logan Gunthorpe wrote: >>> >>> >>> On 05/09/18 02:19 PM, Jens Axboe wrote: >>>> On 9/5/18 2:18 PM, Logan Gunthorpe wrote: >>>>> >>>>> >>>>> On 05/09/18 02:14 PM, Jens Axboe wrote: >>>>>> But if the caller must absolutely know where the bio will end up, then >>>>>> it seems super redundant. So I'd vote for killing this check, it buys >>>>>> us absolutely nothing and isn't even exhaustive in its current form. >>>>> >>>>> >>>>> Ok, I'll remove it for v6. >>>> >>>> Since the drivers needs to know it's doing it right, it might not >>>> hurt to add a sanity check helper for that. Just have the driver >>>> call it, and don't add it in the normal IO submission path. >>> >>> I'm not sure I really see the value in that. It's the same principle in >>> asking the driver to do the WARN: if the developer knew enough to use >>> the special helper, they probably knew well enough to do the rest correctly. >> >> I don't agree with that at all. It's a "is my request valid" helper, >> it's not some obscure and rarely used functionality. You're making up >> this API right now, if you really want it done for every IO, make it >> part of the p2p submission process. You could even hide it behind a >> debug thing, if you like. > > There is no special p2p submission process. In the nvme-of case we are > using the existing process and with the code in blk-core it didn't > change it's process at all. Creating a helper will create one and I can > look at making a pci_p2pdma_submit_bio() for v6; but if the developer > screws up and still calls the regular submit_bio() things will only be > very subtly broken and that won't be obvious. I'm very sure that something that basic will be caught in review. I don't care if you wrap the submission or just require the caller to call some validity helper check first, fwiw. And I think we're done beating the dead horse at this point.
On Wed, Sep 05, 2018 at 03:03:18PM -0600, Logan Gunthorpe wrote: > There is no special p2p submission process. In the nvme-of case we are > using the existing process and with the code in blk-core it didn't > change it's process at all. Creating a helper will create one and I can > look at making a pci_p2pdma_submit_bio() for v6; but if the developer > screws up and still calls the regular submit_bio() things will only be > very subtly broken and that won't be obvious. I just saw you added that "helper" in your tree. Please don't, it is a negative value add as it doesn't help anything with the checking.
On 10/09/18 10:41 AM, Christoph Hellwig wrote: > On Wed, Sep 05, 2018 at 03:03:18PM -0600, Logan Gunthorpe wrote: >> There is no special p2p submission process. In the nvme-of case we are >> using the existing process and with the code in blk-core it didn't >> change it's process at all. Creating a helper will create one and I can >> look at making a pci_p2pdma_submit_bio() for v6; but if the developer >> screws up and still calls the regular submit_bio() things will only be >> very subtly broken and that won't be obvious. > > I just saw you added that "helper" in your tree. Please don't, it is > a negative value add as it doesn't help anything with the checking. Alright, so what's the consensus then? Just have a check in nvmet_bdev_execute_rw() to add REQ_NOMERGE when appropriate? Jens is pretty dead set against adding to the common path. Logan P.S. Here's the commit in question for anyone else on the list: https://github.com/sbates130272/linux-p2pmem/commit/eeabe0bc94491d3eec4fe872274a9e3b4cdea538
On Mon, Sep 10, 2018 at 12:11:01PM -0600, Logan Gunthorpe wrote: > > I just saw you added that "helper" in your tree. Please don't, it is > > a negative value add as it doesn't help anything with the checking. > > Alright, so what's the consensus then? Just have a check in > nvmet_bdev_execute_rw() to add REQ_NOMERGE when appropriate? Yes.
diff --git a/block/blk-core.c b/block/blk-core.c index dee56c282efb..cc0289c7b983 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2264,6 +2264,20 @@ generic_make_request_checks(struct bio *bio) if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_rq_based(q)) goto not_supported; + /* + * Ensure PCI P2PDMA memory is not used in requests to queues that + * have no support. This should never happen if the higher layers using + * P2PDMA do the right thing and use the proper P2PDMA client + * infrastructure. Also, ensure such requests use REQ_NOMERGE + * seeing requests can not mix P2PDMA and non-P2PDMA memory at + * this time. + */ + if (bio->bi_vcnt && is_pci_p2pdma_page(bio->bi_io_vec->bv_page)) { + if (WARN_ON_ONCE(!blk_queue_pci_p2pdma(q))) + goto not_supported; + bio->bi_opf |= REQ_NOMERGE; + } + if (should_fail_bio(bio)) goto end_io; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d6869e0e2b64..7bf80ca802e1 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -699,6 +699,7 @@ struct request_queue { #define QUEUE_FLAG_SCSI_PASSTHROUGH 27 /* queue supports SCSI commands */ #define QUEUE_FLAG_QUIESCED 28 /* queue has been quiesced */ #define QUEUE_FLAG_PREEMPT_ONLY 29 /* only process REQ_PREEMPT requests */ +#define QUEUE_FLAG_PCI_P2PDMA 30 /* device supports pci p2p requests */ #define QUEUE_FLAG_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \ (1 << QUEUE_FLAG_SAME_COMP) | \ @@ -731,6 +732,8 @@ bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q); #define blk_queue_dax(q) test_bit(QUEUE_FLAG_DAX, &(q)->queue_flags) #define blk_queue_scsi_passthrough(q) \ test_bit(QUEUE_FLAG_SCSI_PASSTHROUGH, &(q)->queue_flags) +#define blk_queue_pci_p2pdma(q) \ + test_bit(QUEUE_FLAG_PCI_P2PDMA, &(q)->queue_flags) #define blk_noretry_request(rq) \ ((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
QUEUE_FLAG_PCI_P2P is introduced meaning a driver's request queue supports targeting P2P memory. When a request is submitted we check if PCI P2PDMA memory is assigned to the first page in the bio. If it is, we ensure the queue it's submitted to supports it, and enforce REQ_NOMERGE. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> --- block/blk-core.c | 14 ++++++++++++++ include/linux/blkdev.h | 3 +++ 2 files changed, 17 insertions(+)