Message ID: 20250310-dmem-cgroups-v1-6-2984c1bc9312@kernel.org (mailing list archive)
State: New
Series: dma: Enable dmem cgroup tracking
On 2025-03-10 12:06 pm, Maxime Ripard wrote:
> Consumers of the direct DMA API will have to know which region their
> device allocate from in order for them to charge the memory allocation
> in the right one.

This doesn't seem to make much sense - dma-direct is not an allocator
itself, it just provides the high-level dma_alloc_attrs/dma_alloc_pages/etc.
interfaces wherein the underlying allocations _could_ come from CMA, but
also a per-device coherent/restricted pool, or a global coherent/atomic
pool, or the regular page allocator, or in one weird corner case the
SWIOTLB buffer, or...

Thanks,
Robin.

> Let's provide an accessor for that region.
>
> Signed-off-by: Maxime Ripard <mripard@kernel.org>
> ---
>  include/linux/dma-direct.h | 2 ++
>  kernel/dma/direct.c        | 8 ++++++++
>  2 files changed, 10 insertions(+)
>
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index d7e30d4f7503a898a456df8eedf6a2cd284c35ff..2dd7cbccfaeed81c18c67aae877417fe89f2f2f5 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -145,6 +145,8 @@ void dma_direct_free_pages(struct device *dev, size_t size,
>  		enum dma_data_direction dir);
>  int dma_direct_supported(struct device *dev, u64 mask);
>  dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
>  		size_t size, enum dma_data_direction dir, unsigned long attrs);
>
> +struct dmem_cgroup_region *dma_direct_get_dmem_cgroup_region(struct device *dev);
> +
>  #endif /* _LINUX_DMA_DIRECT_H */
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 5b4e6d3bf7bcca8930877ba078aed4ce26828f06..ece1361077b6efeec5b202d838750afd967d473f 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -42,10 +42,18 @@ u64 dma_direct_get_required_mask(struct device *dev)
>  	u64 max_dma = phys_to_dma_direct(dev, phys);
>
>  	return (1ULL << (fls64(max_dma) - 1)) * 2 - 1;
>  }
>
> +#if IS_ENABLED(CONFIG_CGROUP_DMEM)
> +struct dmem_cgroup_region *
> +dma_direct_get_dmem_cgroup_region(struct device *dev)
> +{
> +	return dma_contiguous_get_dmem_cgroup_region(dev);
> +}
> +#endif
> +
>  static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit)
>  {
>  	u64 dma_limit = min_not_zero(
>  			dev->coherent_dma_mask,
>  			dev->bus_dma_limit);
On Mon, Mar 10, 2025 at 02:56:37PM +0000, Robin Murphy wrote:
> On 2025-03-10 12:06 pm, Maxime Ripard wrote:
> > Consumers of the direct DMA API will have to know which region their
> > device allocate from in order for them to charge the memory allocation
> > in the right one.
>
> This doesn't seem to make much sense - dma-direct is not an allocator
> itself, it just provides the high-level dma_alloc_attrs/dma_alloc_pages/etc.
> interfaces wherein the underlying allocations _could_ come from CMA, but
> also a per-device coherent/restricted pool, or a global coherent/atomic
> pool, or the regular page allocator, or in one weird corner case the SWIOTLB
> buffer, or...

I guess it wasn't super clear, but what I meant is that it's an
allocator to the consumer: it gets called, and returns a buffer. How it
does so is transparent to the device, and on the other side of the
abstraction.

I do agree that the logic is complicated to follow, and that's what I
was getting at in the cover letter.

Maxime
On 2025-03-10 4:28 pm, Maxime Ripard wrote:
> On Mon, Mar 10, 2025 at 02:56:37PM +0000, Robin Murphy wrote:
>> On 2025-03-10 12:06 pm, Maxime Ripard wrote:
>>> Consumers of the direct DMA API will have to know which region their
>>> device allocate from in order for them to charge the memory allocation
>>> in the right one.
>>
>> This doesn't seem to make much sense - dma-direct is not an allocator
>> itself, it just provides the high-level dma_alloc_attrs/dma_alloc_pages/etc.
>> interfaces wherein the underlying allocations _could_ come from CMA, but
>> also a per-device coherent/restricted pool, or a global coherent/atomic
>> pool, or the regular page allocator, or in one weird corner case the SWIOTLB
>> buffer, or...
>
> I guess it wasn't super clear, but what I meant is that it's an
> allocator to the consumer: it gets called, and returns a buffer. How it
> does so is transparent to the device, and on the other side of the
> abstraction.
>
> I do agree that the logic is complicated to follow, and that's what I
> was getting at in the cover letter.

Right, but ultimately my point is that when we later end up with:

	struct dmem_cgroup_region *
	dma_get_dmem_cgroup_region(struct device *dev)
	{
		if (dma_alloc_direct(dev, get_dma_ops(dev)))
			return dma_direct_get_dmem_cgroup_region(dev);
				= dma_contiguous_get_dmem_cgroup_region(dev);

it's objectively wrong given what dma_alloc_direct() means in context:

	void *dma_alloc_attrs(...)
	{
		if (dma_alloc_direct(dev, ops))
			cpu_addr = dma_direct_alloc(...);

where dma_direct_alloc() may then use at least 5 different allocation
methods, only one of which is CMA. Accounting things which are not CMA
to CMA seems to thoroughly defeat the purpose of having such
fine-grained accounting at all. This is why the very notion of
"consumers of dma-direct" should fundamentally not be a thing IMO.

Drivers consume the DMA API interfaces, and the DMA API ultimately
consumes various memory allocators, but what happens in between is
nobody else's business; dma-direct happens to represent *some* paths
between the two, but there are plenty more paths to the same (and
different) allocators through other DMA API implementations as well.
Which route a particular call takes to end up at a particular allocator
is not meaningful unless you are the DMA ops dispatch code. Or to put it
another way, to even go for the "dumbest possible correct solution", the
plumbing of dma_get_dmem_cgroup_region() would need to be about as
complex and widespread as the plumbing of dma_alloc_attrs() itself ;)

I think I see why a simple DMA attribute couldn't be made to work, as
dmem_cgroup_uncharge() can't simply look up the pool the same way
dmem_cgroup_try_charge() found it, since we still need a cg for that and
get_current_dmemcs() can't be assumed to be stable over time, right? At
that point I'm probably starting to lean towards a whole new DMA op with
a properly encapsulated return type (and maybe a long-term goal of
consolidating the 3 or 4 different allocation types we already have), or
just a single dmem region for "DMA API memory" without caring where it
came from (although I do see the issues with that too - you probably
wouldn't want to ration a device-private pool the same way as global
system memory, for example).

Thanks,
Robin.
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index d7e30d4f7503a898a456df8eedf6a2cd284c35ff..2dd7cbccfaeed81c18c67aae877417fe89f2f2f5 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -145,6 +145,8 @@ void dma_direct_free_pages(struct device *dev, size_t size,
 		enum dma_data_direction dir);
 int dma_direct_supported(struct device *dev, u64 mask);
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs);
 
+struct dmem_cgroup_region *dma_direct_get_dmem_cgroup_region(struct device *dev);
+
 #endif /* _LINUX_DMA_DIRECT_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 5b4e6d3bf7bcca8930877ba078aed4ce26828f06..ece1361077b6efeec5b202d838750afd967d473f 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -42,10 +42,18 @@ u64 dma_direct_get_required_mask(struct device *dev)
 	u64 max_dma = phys_to_dma_direct(dev, phys);
 
 	return (1ULL << (fls64(max_dma) - 1)) * 2 - 1;
 }
 
+#if IS_ENABLED(CONFIG_CGROUP_DMEM)
+struct dmem_cgroup_region *
+dma_direct_get_dmem_cgroup_region(struct device *dev)
+{
+	return dma_contiguous_get_dmem_cgroup_region(dev);
+}
+#endif
+
 static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit)
 {
 	u64 dma_limit = min_not_zero(
 			dev->coherent_dma_mask,
 			dev->bus_dma_limit);
Consumers of the direct DMA API will have to know which region their
device allocates from in order to charge the memory allocation to the
right one.

Let's provide an accessor for that region.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
---
 include/linux/dma-direct.h | 2 ++
 kernel/dma/direct.c        | 8 ++++++++
 2 files changed, 10 insertions(+)