diff mbox series

[RFC,v3,5/8] dmabuf: Add gpu cgroup charge transfer function

Message ID 20220309165222.2843651-6-tjmercier@google.com (mailing list archive)
State New
Headers show
Series Proposal for a GPU cgroup controller | expand

Commit Message

T.J. Mercier March 9, 2022, 4:52 p.m. UTC
From: Hridya Valsaraju <hridya@google.com>

The dma_buf_charge_transfer function provides a way for processes to
transfer charge of a buffer to a different process. This is essential
for the cases where a central allocator process does allocations for
various subsystems, hands over the fd to the client who requested the
memory and drops all references to the allocated memory.

Signed-off-by: Hridya Valsaraju <hridya@google.com>
Signed-off-by: T.J. Mercier <tjmercier@google.com>

---
v3 changes
Use more common dual author commit message format per John Stultz.

v2 changes
Move dma-buf cgroup charge transfer from a dma_buf_op defined by every
heap to a single dma-buf function for all heaps per Daniel Vetter and
Christian König.
---
 drivers/dma-buf/dma-buf.c | 48 +++++++++++++++++++++++++++++++++++++++
 include/linux/dma-buf.h   |  2 ++
 2 files changed, 50 insertions(+)

Comments

Michal Koutný March 21, 2022, 5:45 p.m. UTC | #1
Hello.

On Wed, Mar 09, 2022 at 04:52:15PM +0000, "T.J. Mercier" <tjmercier@google.com> wrote:
> +int dma_buf_charge_transfer(struct dma_buf *dmabuf, struct gpucg *gpucg)
> +{
> +#ifdef CONFIG_CGROUP_GPU
> +	struct gpucg *current_gpucg;
> +	int ret = 0;
> +
> +	/*
> +	 * Verify that the cgroup of the process requesting the transfer is the
> +	 * same as the one the buffer is currently charged to.
> +	 */
> +	current_gpucg = gpucg_get(current);
> +	mutex_lock(&dmabuf->lock);
> +	if (current_gpucg != dmabuf->gpucg) {
> +		ret = -EPERM;
> +		goto err;
> +	}

Add a shortcut for gpucg == current_gpucg?

> +
> +	ret = gpucg_try_charge(gpucg, dmabuf->gpucg_dev, dmabuf->size);
> +	if (ret)
> +		goto err;
> +
> +	dmabuf->gpucg = gpucg;
> +
> +	/* uncharge the buffer from the cgroup it's currently charged to. */
> +	gpucg_uncharge(current_gpucg, dmabuf->gpucg_dev, dmabuf->size);

I think gpucg_* API would need to cater for such transfers too since
possibly transitional breach of a limit during the transfer may
unnecessarily fail the operation.

My 0.02€,
Michal
T.J. Mercier March 21, 2022, 11:54 p.m. UTC | #2
On Mon, Mar 21, 2022 at 10:45 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> Hello.
>
> On Wed, Mar 09, 2022 at 04:52:15PM +0000, "T.J. Mercier" <tjmercier@google.com> wrote:
> > +int dma_buf_charge_transfer(struct dma_buf *dmabuf, struct gpucg *gpucg)
> > +{
> > +#ifdef CONFIG_CGROUP_GPU
> > +     struct gpucg *current_gpucg;
> > +     int ret = 0;
> > +
> > +     /*
> > +      * Verify that the cgroup of the process requesting the transfer is the
> > +      * same as the one the buffer is currently charged to.
> > +      */
> > +     current_gpucg = gpucg_get(current);
> > +     mutex_lock(&dmabuf->lock);
> > +     if (current_gpucg != dmabuf->gpucg) {
> > +             ret = -EPERM;
> > +             goto err;
> > +     }
>
> Add a shortcut for gpucg == current_gpucg?

Good idea, thank you!

>
> > +
> > +     ret = gpucg_try_charge(gpucg, dmabuf->gpucg_dev, dmabuf->size);
> > +     if (ret)
> > +             goto err;
> > +
> > +     dmabuf->gpucg = gpucg;
> > +
> > +     /* uncharge the buffer from the cgroup it's currently charged to. */
> > +     gpucg_uncharge(current_gpucg, dmabuf->gpucg_dev, dmabuf->size);
>
> I think gpucg_* API would need to cater for such transfers too since
> possibly transitional breach of a limit during the transfer may
> unnecessarily fail the operation.

Since the charge is duplicated in two cgroups for a short period
before it is uncharged from the source cgroup I guess the situation
you're thinking about is a global (or common ancestor) limit? I can
see how that would be a problem for transfers done this way and an
alternative would be to swap the order of the charge operations: first
uncharge, then try_charge. To be certain the uncharge is reversible if
the try_charge fails, I think I'd need either a mutex used at all
gpucg_*charge call sites or access to the gpucg_mutex, which implies
adding transfer support to gpu.c as part of the gpucg_* API itself and
calling it here. Am I following correctly here?

This series doesn't actually add limit support just accounting, but
I'd like to get it right here.

>
> My 0.02€,
> Michal

Thanks!
diff mbox series

Patch

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 83d0d1b91547..55e1b982f840 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1374,6 +1374,54 @@  void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map)
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, DMA_BUF);
 
+/**
+ * dma_buf_charge_transfer - Change the GPU cgroup to which the provided dma_buf
+ * is charged.
+ * @dmabuf:	[in]	buffer whose charge will be migrated to a different GPU
+ *			cgroup
+ * @gpucg:	[in]	the destination GPU cgroup for dmabuf's charge
+ *
+ * Only tasks that belong to the same cgroup the buffer is currently charged to
+ * may call this function, otherwise it will return -EPERM.
+ *
+ * Returns 0 on success, or a negative errno code otherwise.
+ */
+int dma_buf_charge_transfer(struct dma_buf *dmabuf, struct gpucg *gpucg)
+{
+#ifdef CONFIG_CGROUP_GPU
+	struct gpucg *current_gpucg;
+	int ret = 0;
+
+	/*
+	 * Verify that the cgroup of the process requesting the transfer is the
+	 * same as the one the buffer is currently charged to.
+	 */
+	current_gpucg = gpucg_get(current);
+	mutex_lock(&dmabuf->lock);
+	if (current_gpucg != dmabuf->gpucg) {
+		ret = -EPERM;
+		goto err;
+	}
+
+	ret = gpucg_try_charge(gpucg, dmabuf->gpucg_dev, dmabuf->size);
+	if (ret)
+		goto err;
+
+	dmabuf->gpucg = gpucg;
+
+	/* uncharge the buffer from the cgroup it's currently charged to. */
+	gpucg_uncharge(current_gpucg, dmabuf->gpucg_dev, dmabuf->size);
+
+err:
+	mutex_unlock(&dmabuf->lock);
+	gpucg_put(current_gpucg);
+	return ret;
+#else
+	return 0;
+#endif /* CONFIG_CGROUP_GPU */
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_charge_transfer, DMA_BUF);
+
 #ifdef CONFIG_DEBUG_FS
 static int dma_buf_debug_show(struct seq_file *s, void *unused)
 {
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 742f29c3daaf..85c940c08867 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -646,4 +646,6 @@  int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
 		 unsigned long);
 int dma_buf_vmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
 void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
+
+int dma_buf_charge_transfer(struct dma_buf *dmabuf, struct gpucg *gpucg);
 #endif /* __DMA_BUF_H__ */