Message ID | 20190909125658.30559-2-thierry.reding@gmail.com (mailing list archive)
---|---
State | New, archived
Series | mmc: Fix scatter/gather on SDHCI
On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> When enabling the DMA map merging capability for a queue, ensure that
> the maximum segment size does not exceed the device's limit.

We can't do that unfortunately.  If we use the virt_boundary setting
we do aggressive merges that there is no accounting for.  So we can't
limit the segment size.

And at least for the case how devices usually do the addressing
(page based and not sgl based) that needs the virt_boundary there isn't
really any concept like a segment anyway.
On Mon, Sep 09, 2019 at 06:13:31PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> >
> > When enabling the DMA map merging capability for a queue, ensure that
> > the maximum segment size does not exceed the device's limit.
>
> We can't do that unfortunately.  If we use the virt_boundary setting
> we do aggressive merges that there is no accounting for.  So we can't
> limit the segment size.

But that's kind of the point here. My understanding is that the maximum
segment size in the device's DMA parameters defines the maximum size of
the segment that the device can handle.

In the particular case that I'm trying to fix, according to the SDHCI
specification, these devices can handle segments that are a maximum of
64 KiB in size. If we allow that segment size to be exceeded, the device
will no longer be able to handle them.

> And at least for the case how devices usually do the addressing
> (page based and not sgl based) that needs the virt_boundary there isn't
> really any concept like a segment anyway.

I do understand that aspect of it. However, devices that do the
addressing this way, wouldn't they want to set the maximum segment size
to something large (like UINT_MAX, which many users that don't have the
concept of a segment already do)?

If you disagree, do you have any alternative proposals other than
reverting the offending patch? linux-next is currently broken on all
systems where the SDHCI controller is behind an IOMMU.

Thierry
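For context, the limit Thierry refers to is the per-device DMA segment size that a driver advertises through the DMA-mapping API. A minimal illustrative sketch, not taken from the patch under discussion; the 64 KiB value follows the SDHCI descriptor limit mentioned above:

```c
#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/*
 * Illustrative probe-time snippet: advertise that this device cannot
 * handle DMA segments larger than 64 KiB.  Whether the IOMMU merging
 * path honours this limit is exactly what is being debated in this
 * thread.
 */
static int example_probe(struct device *dev)
{
	/* Returns -EIO if the device has no dma_parms to store the limit. */
	return dma_set_max_seg_size(dev, SZ_64K);
}
```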
Hi Thierry,

> From: Thierry Reding, Sent: Tuesday, September 10, 2019 4:19 AM
>
> On Mon, Sep 09, 2019 at 06:13:31PM +0200, Christoph Hellwig wrote:
> > On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote:
> > > From: Thierry Reding <treding@nvidia.com>
> > >
> > > When enabling the DMA map merging capability for a queue, ensure that
> > > the maximum segment size does not exceed the device's limit.
> >
> > We can't do that unfortunately.  If we use the virt_boundary setting
> > we do aggressive merges that there is no accounting for.  So we can't
> > limit the segment size.
>
> But that's kind of the point here. My understanding is that the maximum
> segment size in the device's DMA parameters defines the maximum size of
> the segment that the device can handle.
>
> In the particular case that I'm trying to fix, according to the SDHCI
> specification, these devices can handle segments that are a maximum of
> 64 KiB in size. If we allow that segment size to be exceeded, the device
> will no longer be able to handle them.
>
> > And at least for the case how devices usually do the addressing
> > (page based and not sgl based) that needs the virt_boundary there isn't
> > really any concept like a segment anyway.
>
> I do understand that aspect of it. However, devices that do the
> addressing this way, wouldn't they want to set the maximum segment size
> to something large (like UINT_MAX, which many users that don't have the
> concept of a segment already do)?
>
> If you disagree, do you have any alternative proposals other than
> reverting the offending patch? linux-next is currently broken on all
> systems where the SDHCI controller is behind an IOMMU.

I'm sorry for causing this trouble in your environment. I have a proposal to
resolve this issue. The mmc_host struct will have a new caps2 flag
like MMC_CAP2_MERGE_CAPABLE and add a condition to queue.c like the one below.

+	if (host->caps2 & MMC_CAP2_MERGE_CAPABLE &&
+	    host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
	    dma_get_merge_boundary(mmc_dev(host)))
		host->can_dma_map_merge = 1;
	else
		host->can_dma_map_merge = 0;

After that, all mmc controllers disable the feature by default, and if an mmc
controller has such a capability, the host driver should set the flag.
Ulf, is such a patch acceptable for v5.4-rcN? Anyway, I'll submit such a patch later.

Best regards,
Yoshihiro Shimoda

> Thierry
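The host-driver side of this proposal would be a one-line opt-in. A hypothetical sketch, where MMC_CAP2_MERGE_CAPABLE is the flag proposed above (it does not exist at this point in the thread) and the probe function is made up for illustration:

```c
#include <linux/mmc/host.h>
#include <linux/platform_device.h>

/*
 * Hypothetical host driver opting in to DMA-map merging under the
 * proposal above: only controllers that can cope with IOMMU-merged
 * segments set the flag; everyone else keeps the feature disabled.
 */
static int example_mmc_probe(struct platform_device *pdev)
{
	struct mmc_host *mmc;

	mmc = mmc_alloc_host(0, &pdev->dev);
	if (!mmc)
		return -ENOMEM;

	mmc->caps2 |= MMC_CAP2_MERGE_CAPABLE;	/* proposed flag, see above */

	return mmc_add_host(mmc);
}
```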
On Tue, Sep 10, 2019 at 02:03:17AM +0000, Yoshihiro Shimoda wrote:
> I'm sorry for causing this trouble in your environment. I have a proposal to
> resolve this issue. The mmc_host struct will have a new caps2 flag
> like MMC_CAP2_MERGE_CAPABLE and add a condition to queue.c like the one below.
>
> +	if (host->caps2 & MMC_CAP2_MERGE_CAPABLE &&
> +	    host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
> 	    dma_get_merge_boundary(mmc_dev(host)))
> 		host->can_dma_map_merge = 1;
> 	else
> 		host->can_dma_map_merge = 0;
>
> After that, all mmc controllers disable the feature by default, and if an mmc
> controller has such a capability, the host driver should set the flag.

That sounds sensible to me.  Alternatively we'd have to limit
max_sectors to 16-bit values for sdhci if using an iommu that can
merge.
On Tue, Sep 10, 2019 at 02:03:17AM +0000, Yoshihiro Shimoda wrote:
> Hi Thierry,
>
> > From: Thierry Reding, Sent: Tuesday, September 10, 2019 4:19 AM
> >
> > On Mon, Sep 09, 2019 at 06:13:31PM +0200, Christoph Hellwig wrote:
> > > On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote:
> > > > From: Thierry Reding <treding@nvidia.com>
> > > >
> > > > When enabling the DMA map merging capability for a queue, ensure that
> > > > the maximum segment size does not exceed the device's limit.
> > >
> > > We can't do that unfortunately.  If we use the virt_boundary setting
> > > we do aggressive merges that there is no accounting for.  So we can't
> > > limit the segment size.
> >
> > But that's kind of the point here. My understanding is that the maximum
> > segment size in the device's DMA parameters defines the maximum size of
> > the segment that the device can handle.
> >
> > In the particular case that I'm trying to fix, according to the SDHCI
> > specification, these devices can handle segments that are a maximum of
> > 64 KiB in size. If we allow that segment size to be exceeded, the device
> > will no longer be able to handle them.
> >
> > > And at least for the case how devices usually do the addressing
> > > (page based and not sgl based) that needs the virt_boundary there isn't
> > > really any concept like a segment anyway.
> >
> > I do understand that aspect of it. However, devices that do the
> > addressing this way, wouldn't they want to set the maximum segment size
> > to something large (like UINT_MAX, which many users that don't have the
> > concept of a segment already do)?
> >
> > If you disagree, do you have any alternative proposals other than
> > reverting the offending patch? linux-next is currently broken on all
> > systems where the SDHCI controller is behind an IOMMU.
>
> I'm sorry for causing this trouble in your environment. I have a proposal to
> resolve this issue. The mmc_host struct will have a new caps2 flag
> like MMC_CAP2_MERGE_CAPABLE and add a condition to queue.c like the one below.
>
> +	if (host->caps2 & MMC_CAP2_MERGE_CAPABLE &&
> +	    host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
> 	    dma_get_merge_boundary(mmc_dev(host)))
> 		host->can_dma_map_merge = 1;
> 	else
> 		host->can_dma_map_merge = 0;
>
> After that, all mmc controllers disable the feature by default, and if an mmc
> controller has such a capability, the host driver should set the flag.
> Ulf, is such a patch acceptable for v5.4-rcN? Anyway, I'll submit such a patch later.

I'm sure that would work, but I think that's missing the point. It's not
that the setup isn't capable of merging at all. It just can't deal with
segments that are too large.

While I was debugging this, I was frequently seeing cases where the SG
was on the order of 40 entries initially and after dma_map_sg() it was
reduced to just 4 or 5.

So I think merging is still really useful if a setup supports it via an
IOMMU. I'm just not sure I see why we can't make the code respect
whatever the maximum segment size that the driver may have configured.
That seems like it should allow us to get the best of both worlds.

Thierry
On Tue, Sep 10, 2019 at 08:13:48AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 10, 2019 at 02:03:17AM +0000, Yoshihiro Shimoda wrote:
> > I'm sorry for causing this trouble in your environment. I have a proposal to
> > resolve this issue. The mmc_host struct will have a new caps2 flag
> > like MMC_CAP2_MERGE_CAPABLE and add a condition to queue.c like the one below.
> >
> > +	if (host->caps2 & MMC_CAP2_MERGE_CAPABLE &&
> > +	    host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
> > 	    dma_get_merge_boundary(mmc_dev(host)))
> > 		host->can_dma_map_merge = 1;
> > 	else
> > 		host->can_dma_map_merge = 0;
> >
> > After that, all mmc controllers disable the feature by default, and if an mmc
> > controller has such a capability, the host driver should set the flag.
>
> That sounds sensible to me.  Alternatively we'd have to limit
> max_sectors to 16-bit values for sdhci if using an iommu that can
> merge.

Isn't that effectively what dma_set_max_seg_size() is supposed to be
doing? That tells the DMA API what the maximum size of a segment can
be for the given device, right? If we make sure never to exceed that
when compacting the SG, the SG that we get back should map just fine
into the descriptors that SDHCI supports.

Now, for devices that support larger segments (or in fact no concept of
segments at all), if we set the maximum segment size to something
really big, isn't that going to automatically allow the huge, merged
segments that the original patch intended?

To me this sounds simply like an issue of the queue code thinking it
knows better than the driver and just overriding the maximum segment
size. Isn't that the real bug here that needs to be fixed?

Thierry
Hi Thierry,

> From: Thierry Reding, Sent: Tuesday, September 10, 2019 4:31 PM
<snip>
> On Tue, Sep 10, 2019 at 02:03:17AM +0000, Yoshihiro Shimoda wrote:
> > Hi Thierry,
> >
> > > From: Thierry Reding, Sent: Tuesday, September 10, 2019 4:19 AM
> > >
> > > On Mon, Sep 09, 2019 at 06:13:31PM +0200, Christoph Hellwig wrote:
> > > > On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote:
> > > > > From: Thierry Reding <treding@nvidia.com>
> > > > >
> > > > > When enabling the DMA map merging capability for a queue, ensure that
> > > > > the maximum segment size does not exceed the device's limit.
> > > >
> > > > We can't do that unfortunately.  If we use the virt_boundary setting
> > > > we do aggressive merges that there is no accounting for.  So we can't
> > > > limit the segment size.
> > >
> > > But that's kind of the point here. My understanding is that the maximum
> > > segment size in the device's DMA parameters defines the maximum size of
> > > the segment that the device can handle.
> > >
> > > In the particular case that I'm trying to fix, according to the SDHCI
> > > specification, these devices can handle segments that are a maximum of
> > > 64 KiB in size. If we allow that segment size to be exceeded, the device
> > > will no longer be able to handle them.
> > >
> > > > And at least for the case how devices usually do the addressing
> > > > (page based and not sgl based) that needs the virt_boundary there isn't
> > > > really any concept like a segment anyway.
> > >
> > > I do understand that aspect of it. However, devices that do the
> > > addressing this way, wouldn't they want to set the maximum segment size
> > > to something large (like UINT_MAX, which many users that don't have the
> > > concept of a segment already do)?
> > >
> > > If you disagree, do you have any alternative proposals other than
> > > reverting the offending patch? linux-next is currently broken on all
> > > systems where the SDHCI controller is behind an IOMMU.
> >
> > I'm sorry for causing this trouble in your environment. I have a proposal to
> > resolve this issue. The mmc_host struct will have a new caps2 flag
> > like MMC_CAP2_MERGE_CAPABLE and add a condition to queue.c like the one below.
> >
> > +	if (host->caps2 & MMC_CAP2_MERGE_CAPABLE &&
> > +	    host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
> > 	    dma_get_merge_boundary(mmc_dev(host)))
> > 		host->can_dma_map_merge = 1;
> > 	else
> > 		host->can_dma_map_merge = 0;
> >
> > After that, all mmc controllers disable the feature by default, and if an mmc
> > controller has such a capability, the host driver should set the flag.
> > Ulf, is such a patch acceptable for v5.4-rcN? Anyway, I'll submit such a patch later.
>
> I'm sure that would work, but I think that's missing the point. It's not
> that the setup isn't capable of merging at all. It just can't deal with
> segments that are too large.

IIUC, since SDHCI has a strict 64 KiB limitation on each segment, the
controller cannot handle the following example 1 case on plain
next-20190904.

For example 1:
 - The original scatterlist has 4 segments:
	sg[0]: .dma_address = 0x12340000, .length = 65536,
	sg[1]: .dma_address = 0x12350000, .length = 65536,
	sg[2]: .dma_address = 0x12360000, .length = 65536,
	sg[3]: .dma_address = 0x12370000, .length = 65536,
 - Merging the above segments yields a single segment:
	sg[0]: .dma_address = 0x12340000, .length = 262144,

> While I was debugging this, I was frequently seeing cases where the SG
> was on the order of 40 entries initially and after dma_map_sg() it was
> reduced to just 4 or 5.

If each segment size is small, it can merge them.

For example 2:
 - The original scatterlist has 4 segments:
	sg[0]: .dma_address = 0x12340000, .length = 4096,
	sg[1]: .dma_address = 0x12341000, .length = 4096,
	sg[2]: .dma_address = 0x12342000, .length = 4096,
	sg[3]: .dma_address = 0x12343000, .length = 4096,
 - Merging the above segments yields a single segment:
	sg[0]: .dma_address = 0x12340000, .length = 16384,

> So I think merging is still really useful if a setup supports it via an
> IOMMU. I'm just not sure I see why we can't make the code respect
> whatever the maximum segment size that the driver may have configured.
> That seems like it should allow us to get the best of both worlds.

I agree that merging is useful for the case of example 2.

By the way, I checked dma-iommu.c and found that __finalise_sg() has a
condition on "seg_mask", which comes from dma_get_seg_boundary(). So I'm
guessing that if sdhci.c calls dma_set_seg_boundary() with 0x0000ffff, the
issue disappears, because dma-iommu.c will then not merge the segments even
in the example 1 case. What do you think?

Best regards,
Yoshihiro Shimoda

> Thierry
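A minimal sketch of the workaround suggested here, assuming a probe-time hook in the SDHCI driver; whether sdhci.c actually gains such a call is not settled in this thread:

```c
#include <linux/dma-mapping.h>

/*
 * Illustrative sketch of the suggestion above: advertise a 64 KiB DMA
 * segment boundary so that dma-iommu.c's __finalise_sg() (which checks
 * dma_get_seg_boundary()) stops merging mapped segments across a
 * 64 KiB boundary, keeping each segment within the SDHCI limit.
 */
static int example_sdhci_setup_dma(struct device *dev)
{
	return dma_set_seg_boundary(dev, 0x0000ffff);
}
```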
On Tue, Sep 10, 2019 at 09:37:39AM +0200, Thierry Reding wrote:
> > > After that, all mmc controllers disable the feature by default, and if an mmc
> > > controller has such a capability, the host driver should set the flag.
> >
> > That sounds sensible to me.  Alternatively we'd have to limit
> > max_sectors to 16-bit values for sdhci if using an iommu that can
> > merge.
>
> Isn't that effectively what dma_set_max_seg_size() is supposed to be
> doing? That tells the DMA API what the maximum size of a segment can
> be for the given device, right? If we make sure never to exceed that
> when compacting the SG, the SG that we get back should map just fine
> into the descriptors that SDHCI supports.

dma_set_max_seg_size() does indeed instruct the iommu drivers about the
merging capabilities (btw, swiotlb should be able to implement this kind
of merging as well, but that is a different discussion).

But the problem is that you don't just change the dma_set_max_seg_size,
but also the block layer max segment size setting, which is used for
block layer merges.  And we don't have the accounting for the first and
last segment in a request (those that are being merged to), so if you
enable the virt_boundary, segments can grow to a size only limited by
the maximum request size.

We could add that accounting with a bit of work; it's just that for
devices that typically use the virt boundary there is no point (their
actual segment is a page and not related to the Linux "segment").
On Tue, Sep 10, 2019 at 02:03:17AM +0000, Yoshihiro Shimoda wrote:
> I'm sorry for causing this trouble in your environment. I have a proposal to
> resolve this issue. The mmc_host struct will have a new caps2 flag
> like MMC_CAP2_MERGE_CAPABLE and add a condition to queue.c like the one below.
>
> +	if (host->caps2 & MMC_CAP2_MERGE_CAPABLE &&
> +	    host->max_segs < MMC_DMA_MAP_MERGE_SEGMENTS &&
> 	    dma_get_merge_boundary(mmc_dev(host)))
> 		host->can_dma_map_merge = 1;
> 	else
> 		host->can_dma_map_merge = 0;
>
> After that, all mmc controllers disable the feature by default, and if an mmc
> controller has such a capability, the host driver should set the flag.
> Ulf, is such a patch acceptable for v5.4-rcN? Anyway, I'll submit such a patch later.

FYI, I'd love to see such a patch, even if only as a temporary band-aid
so that I don't have to revert the series before sending the pull
request to Linus.
On Mon, Sep 09, 2019 at 06:13:31PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 09, 2019 at 02:56:56PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> >
> > When enabling the DMA map merging capability for a queue, ensure that
> > the maximum segment size does not exceed the device's limit.
>
> We can't do that unfortunately.  If we use the virt_boundary setting
> we do aggressive merges that there is no accounting for.  So we can't
> limit the segment size.

Could you explain a bit why we can't do that?

The segment size limit was basically removed by the following commit:
200a9aff7b02 ("block: remove the segment size check in bio_will_gap").

Before that commit, the max segment size limit worked.

Thanks,
Ming
On Thu, Sep 12, 2019 at 08:57:29AM +0800, Ming Lei wrote:
> Could you explain a bit why we can't do that?
>
> The segment size limit was basically removed by the following commit:
> 200a9aff7b02 ("block: remove the segment size check in bio_will_gap").
>
> Before that commit, the max segment size limit worked.

No, it didn't, as explained in the commit log.  It worked for some
cases but not others.
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 70b39f14e974..9fb1288fbc12 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -738,12 +738,8 @@ void blk_queue_segment_boundary(struct request_queue *q, unsigned long mask)
 }
 EXPORT_SYMBOL(blk_queue_segment_boundary);
 
-/**
- * blk_queue_virt_boundary - set boundary rules for bio merging
- * @q:  the request queue for the device
- * @mask:  the memory boundary mask
- **/
-void blk_queue_virt_boundary(struct request_queue *q, unsigned long mask)
+void __blk_queue_virt_boundary(struct request_queue *q, unsigned long mask,
+			       unsigned int max_segment_size)
 {
 	q->limits.virt_boundary_mask = mask;
 
@@ -754,7 +750,17 @@ void blk_queue_virt_boundary(struct request_queue *q, unsigned long mask)
 	 * of that they are not limited by our notion of "segment size".
 	 */
 	if (mask)
-		q->limits.max_segment_size = UINT_MAX;
+		q->limits.max_segment_size = max_segment_size;
+}
+
+/**
+ * blk_queue_virt_boundary - set boundary rules for bio merging
+ * @q:  the request queue for the device
+ * @mask:  the memory boundary mask
+ **/
+void blk_queue_virt_boundary(struct request_queue *q, unsigned long mask)
+{
+	__blk_queue_virt_boundary(q, mask, UINT_MAX);
 }
 EXPORT_SYMBOL(blk_queue_virt_boundary);
 
@@ -843,13 +849,13 @@ EXPORT_SYMBOL_GPL(blk_queue_write_cache);
 bool blk_queue_can_use_dma_map_merging(struct request_queue *q,
 				       struct device *dev)
 {
+	unsigned int max_segment_size = dma_get_max_seg_size(dev);
 	unsigned long boundary = dma_get_merge_boundary(dev);
 
 	if (!boundary)
 		return false;
 
-	/* No need to update max_segment_size. see blk_queue_virt_boundary() */
-	blk_queue_virt_boundary(q, boundary);
+	__blk_queue_virt_boundary(q, boundary, max_segment_size);
 
 	return true;
 }
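With the change above, a queue that opts into DMA-map merging inherits the segment-size limit the driver set via dma_set_max_seg_size() instead of being forced to UINT_MAX. A hypothetical caller sketch; the function name and the 64 KiB value are illustrative and not taken from the MMC code:

```c
#include <linux/blkdev.h>
#include <linux/dma-mapping.h>
#include <linux/printk.h>
#include <linux/sizes.h>

/*
 * Hypothetical queue setup for a driver whose hardware descriptors are
 * limited to 64 KiB per segment.  With the patch above,
 * blk_queue_can_use_dma_map_merging() picks up the limit set via
 * dma_set_max_seg_size() instead of forcing max_segment_size to UINT_MAX.
 */
static void example_setup_queue(struct request_queue *q, struct device *dev)
{
	dma_set_max_seg_size(dev, SZ_64K);

	if (blk_queue_can_use_dma_map_merging(q, dev))
		pr_info("using IOMMU DMA-map merging, segments capped at 64 KiB\n");
	else
		blk_queue_max_segment_size(q, SZ_64K);
}
```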