dmabuf: ensure unique directory name for dmabuf stats

Message ID 1652178212-22383-1-git-send-email-quic_charante@quicinc.com (mailing list archive)
State New, archived

Commit Message

Charan Teja Kalla May 10, 2022, 10:23 a.m. UTC
The dmabuf file uses get_next_ino() (through dma_buf_getfile() ->
alloc_anon_inode()) to get an inode number and uses it as the
directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory
collects the dmabuf stats and is created through
dma_buf_stats_setup(). Currently, failure to create this directory
entry makes dma_buf_export() fail.

get_next_ino() can return a repeated inode number, which causes the
directory entry creation to fail with -EEXIST. This is a problem on
systems where the dmabuf stats functionality is enabled in production
builds: dma_buf_export() can fail, even though the dmabuf memory was
allocated successfully, just because the stats entry could not be
created.

We saw this issue on a Snapdragon system within 13 days: a directory
with inode number "122602" already existed, so dma_buf_stats_setup()
failed with -EEXIST when it tried to create the same directory entry.

To make the directory entry unique, append the inode creation time to
the inode number. With this change the stats directory entries have the
format /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
secs>.

Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
 drivers/dma-buf/dma-buf-sysfs-stats.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Greg Kroah-Hartman May 10, 2022, 11 a.m. UTC | #1
On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
> alloc_anon_inode()) to get an inode number and uses the same as a
> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
> used to collect the dmabuf stats and it is created through
> dma_buf_stats_setup(). At current, failure to create this directory
> entry can make the dma_buf_export() to fail.
> 
> Now, as the get_next_ino() can definitely give a repetitive inode no
> causing the directory entry creation to fail with -EEXIST. This is a
> problem on the systems where dmabuf stats functionality is enabled on
> the production builds can make the dma_buf_export(), though the dmabuf
> memory is allocated successfully, to fail just because it couldn't
> create stats entry.

Then maybe we should not fail the creation path if the kobject fails to
be created?  It's just for debugging, so it should be fine if it isn't
there.
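
A minimal userspace sketch of that idea, assuming the export path simply
logs a stats failure and carries on. struct fake_dmabuf, fake_stats_setup()
and fake_export() are hypothetical stand-ins invented for this
illustration; they are not the kernel functions, and the real
dma_buf_export() has more to clean up.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a dma-buf; not the kernel structure. */
struct fake_dmabuf {
	unsigned long ino;
	int stats_ok;	/* 1 once the stats directory exists */
};

/* Pretend stats setup: fails with -EEXIST when the name is taken. */
static int fake_stats_setup(struct fake_dmabuf *buf, int name_taken)
{
	if (name_taken)
		return -EEXIST;
	buf->stats_ok = 1;
	return 0;
}

/* Export succeeds even when stats setup fails; the error is only logged. */
static int fake_export(struct fake_dmabuf *buf, int name_taken)
{
	int ret = fake_stats_setup(buf, name_taken);

	if (ret)
		fprintf(stderr, "dmabuf %lu: stats unavailable (%d)\n",
			buf->ino, ret);
	return 0;
}
```

The cost of this choice is that a buffer can exist without a stats entry,
so anything reading the stats directory sees an incomplete picture.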

> 
> This issue we are able to see on the snapdragon system within 13 days
> where there already exists a directory with inode no "122602" so
> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
> the same directory entry.
> 
> To make the directory entry as unique, append the inode creation time to
> the inode. With this change the stats directory entries will be in the
> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
> secs>.

As you are changing the format here, shouldn't the Documentation/ABI/
entry for this also be changed?

And what's to keep the seconds field from also being the same?

thanks,

greg k-h
Christian König May 10, 2022, 11:35 a.m. UTC | #2
On 10.05.22 at 13:00, Greg KH wrote:
> On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
>> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
>> alloc_anon_inode()) to get an inode number and uses the same as a
>> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
>> used to collect the dmabuf stats and it is created through
>> dma_buf_stats_setup(). At current, failure to create this directory
>> entry can make the dma_buf_export() to fail.
>>
>> Now, as the get_next_ino() can definitely give a repetitive inode no
>> causing the directory entry creation to fail with -EEXIST. This is a
>> problem on the systems where dmabuf stats functionality is enabled on
>> the production builds can make the dma_buf_export(), though the dmabuf
>> memory is allocated successfully, to fail just because it couldn't
>> create stats entry.
> Then maybe we should not fail the creation path of the kobject fails to
> be created?  It's just for debugging, it should be fine if the creation
> of it isn't there.

Well if it's just for debugging then it should be under debugfs and not 
sysfs.

>> This issue we are able to see on the snapdragon system within 13 days
>> where there already exists a directory with inode no "122602" so
>> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
>> the same directory entry.
>>
>> To make the directory entry as unique, append the inode creation time to
>> the inode. With this change the stats directory entries will be in the
>> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
>> secs>.
> As you are changing the format here, shouldn't the Documentation/ABI/
> entry for this also be changed?

As far as I can see that is even a UAPI break, not sure if we can allow 
that.

> And what's to keep the seconds field from also being the same?

Well, exporting two DMA-bufs with the same ino in the same nanosecond 
should be basically impossible, but I would rather opt for using a 64-bit 
atomic in that function.

This should be 100% UAPI compatible, and even if we manage to create one 
buffer every ns we need ~500 years to wrap around.
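
The wrap-around estimate can be checked with a few lines of arithmetic.
The sketch below is a userspace illustration, not kernel code; the helper
seconds_to_wrap() and the three constants are names made up for this
example. At one allocation per nanosecond a 64-bit counter lasts roughly
585 years, which is in line with the figure above.

```c
/* Number of distinct values in a 32-bit and a 64-bit counter. */
#define U32_VALUES 4294967296.0
#define U64_VALUES 18446744073709551616.0

/* Average Julian year in seconds. */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 3600.0)

/* Seconds until a counter with the given number of values wraps. */
static double seconds_to_wrap(double counter_values, double allocs_per_sec)
{
	return counter_values / allocs_per_sec;
}
```

For comparison, at an assumed sustained rate of 4000 allocations per
second (a figure chosen only for illustration), seconds_to_wrap(U32_VALUES,
4000.0) comes to about 12 days, the same order of magnitude as the 13 days
reported in the commit message.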

Regards,
Christian.

>
> thanks,
>
> greg k-h
Charan Teja Kalla May 10, 2022, 11:55 a.m. UTC | #3
Thanks Greg for the inputs!!

On 5/10/2022 4:30 PM, Greg KH wrote:
>> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
>> alloc_anon_inode()) to get an inode number and uses the same as a
>> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
>> used to collect the dmabuf stats and it is created through
>> dma_buf_stats_setup(). At current, failure to create this directory
>> entry can make the dma_buf_export() to fail.
>>
>> Now, as the get_next_ino() can definitely give a repetitive inode no
>> causing the directory entry creation to fail with -EEXIST. This is a
>> problem on the systems where dmabuf stats functionality is enabled on
>> the production builds can make the dma_buf_export(), though the dmabuf
>> memory is allocated successfully, to fail just because it couldn't
>> create stats entry.
> Then maybe we should not fail the creation path of the kobject fails to
> be created?  It's just for debugging, it should be fine if the creation
> of it isn't there.

Not creating the debug node in some special cases makes this interface
unreliable for anyone who wants information about the created dmabuf
buffers. Please correct me if my perspective is wrong here.

IIUC, apart from this -EEXIST condition, failure under the other
conditions (-EINVAL and -ENOMEM) is fine. Since this patch fixes the
-EEXIST error, my opinion is that failing in the kobject creation path
is acceptable, because: a) the user is expected to pass a valid dmabuf
to create the stats node, and b) the user can disable
CONFIG_DMABUF_SYSFS_STATS if they don't want these stats.

> 
>> This issue we are able to see on the snapdragon system within 13 days
>> where there already exists a directory with inode no "122602" so
>> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
>> the same directory entry.
>>
>> To make the directory entry as unique, append the inode creation time to
>> the inode. With this change the stats directory entries will be in the
>> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
>> secs>.
> As you are changing the format here, shouldn't the Documentation/ABI/
> entry for this also be changed?
> 
> And what's to keep the seconds field from also being the same?

get_next_ino() just increments the inode number monotonically and
returns it to the caller, and its data type is 'unsigned int'. Thus two
successive calls always generate different inode numbers but may see the
same secs value. With the inode-secs format, the string is still unique,
e.g. ino1-sec1 and ino2-sec1.

After the inode number overflows and wraps, we may get ino1 again from
get_next_ino(), but then secs will be different, e.g. ino1-secn and
ino2-secn. So it is always a unique string.

IOW, with the secs field added, to get the same inode-secs string, the
uint must overflow within the same second, which is impossible.
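
A small userspace sketch of this argument, assuming a strictly
monotonic 32-bit counter as described above. stats_dir_name() and
names_collide() are hypothetical helpers written for this illustration
only; they just build the "<ino>-<secs>" string and compare two of them.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build the proposed "<ino>-<secs>" directory name. */
static void stats_dir_name(char *out, size_t len, uint32_t ino, int64_t secs)
{
	snprintf(out, len, "%lu-%lld", (unsigned long)ino, (long long)secs);
}

/* Return 1 when two (ino, secs) pairs map to the same directory name. */
static int names_collide(uint32_t ino_a, int64_t secs_a,
			 uint32_t ino_b, int64_t secs_b)
{
	char a[64], b[64];

	stats_dir_name(a, sizeof(a), ino_a, secs_a);
	stats_dir_name(b, sizeof(b), ino_b, secs_b);
	return strcmp(a, b) == 0;
}
```

Two different inode numbers in the same second, or the same inode number
after a wrap in a later second, give different names. The only remaining
collision is the same inode number within the same second.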

Thanks for pointing out the changes needed in the ABI document. I will
make them in the next spin.
Greg Kroah-Hartman May 10, 2022, 12:10 p.m. UTC | #4
On Tue, May 10, 2022 at 01:35:41PM +0200, Christian König wrote:
> On 10.05.22 at 13:00, Greg KH wrote:
> > On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
> > > The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
> > > alloc_anon_inode()) to get an inode number and uses the same as a
> > > directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
> > > used to collect the dmabuf stats and it is created through
> > > dma_buf_stats_setup(). At current, failure to create this directory
> > > entry can make the dma_buf_export() to fail.
> > > 
> > > Now, as the get_next_ino() can definitely give a repetitive inode no
> > > causing the directory entry creation to fail with -EEXIST. This is a
> > > problem on the systems where dmabuf stats functionality is enabled on
> > > the production builds can make the dma_buf_export(), though the dmabuf
> > > memory is allocated successfully, to fail just because it couldn't
> > > create stats entry.
> > Then maybe we should not fail the creation path of the kobject fails to
> > be created?  It's just for debugging, it should be fine if the creation
> > of it isn't there.
> 
> Well if it's just for debugging then it should be under debugfs and not
> sysfs.

I'll note that the original patch series for this described why this was
moved from debugfs to sysfs.

> > > This issue we are able to see on the snapdragon system within 13 days
> > > where there already exists a directory with inode no "122602" so
> > > dma_buf_stats_setup() failed with -EEXIST as it is trying to create
> > > the same directory entry.
> > > 
> > > To make the directory entry as unique, append the inode creation time to
> > > the inode. With this change the stats directory entries will be in the
> > > format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
> > > secs>.
> > As you are changing the format here, shouldn't the Documentation/ABI/
> > entry for this also be changed?
> 
> As far as I can see that is even an UAPI break, not sure if we can allow
> that.

Why?  Device names change all the time and should never be static.  A
buffer name should just be a unique identifier in that directory, that's
all.  No rules on the formatting of it unless for some reason the name
being the inode number was somehow being used in userspace for that
number?

thanks,

greg k-h
Charan Teja Kalla May 10, 2022, 12:16 p.m. UTC | #5
Thanks Christian for the inputs!!

On 5/10/2022 5:05 PM, Christian König wrote:
> 
>> And what's to keep the seconds field from also being the same?
> 
> Well exporting two DMA-bufs with the same ino in the same nanosecond
> should be basically impossible, but I would rather opt for using a 64bit
> atomic in that function.
> 
> This should be 100% UAPI compatible and even if we manage to create one
> buffer ever ns we need ~500years to wrap around.

I see that inode->i_ctime.tv_sec is already defined as 64-bit
(time64_t tv_sec), hence I used it. This way we don't need an extra
static atomic_t variable just to get a unique name.

Pasting an excerpt from my reply to Greg about why this secs-based name
will always be unique: with the secs field added, to get the same
inode-secs string, the uint must overflow within the same second, which
is impossible.

Let me know if you still prefer the atomic variable.
Christian König May 10, 2022, 12:20 p.m. UTC | #6
On 10.05.22 at 14:16, Charan Teja Kalla wrote:
> Thanks Christian for the inputs!!
>
> On 5/10/2022 5:05 PM, Christian König wrote:
>>> And what's to keep the seconds field from also being the same?
>> Well exporting two DMA-bufs with the same ino in the same nanosecond
>> should be basically impossible, but I would rather opt for using a 64bit
>> atomic in that function.
>>
>> This should be 100% UAPI compatible and even if we manage to create one
>> buffer ever ns we need ~500years to wrap around.
> I see that the inode->i_ctime->tv_sec is already defined as
> 64bit(time64_t tv_sec), hence used it. This way we don't need extra
> static atomic_t variable just to get a unique name.
>
> Just pasting excerpt from the reply posted to Greg about why this secs
> will always be a unique: with secs field added, to get the same
> inode-secs string, the uint should overflow in the same second which is
> impossible.
>
> Let me know If you still opt for atomic variable only.

I think just using a static atomic variable here would be cleaner; that 
is 100% unique.

Your approach should probably work as well, but it looks quite contrived.
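
A rough userspace sketch of the counter approach, using C11 atomics in
place of the kernel's atomic64_t. The names next_stats_id,
alloc_stats_id() and stats_dir_name_from_id() are invented for this
illustration and are not taken from any posted patch.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* One process-wide counter; static storage starts at zero. */
static _Atomic uint64_t next_stats_id;

/* Hand out 1, 2, 3, ... with no repeats until the 64-bit value wraps. */
static uint64_t alloc_stats_id(void)
{
	/* atomic_fetch_add() returns the value before the increment. */
	return atomic_fetch_add(&next_stats_id, 1) + 1;
}

/* The directory name stays a plain decimal number. */
static void stats_dir_name_from_id(char *out, size_t len, uint64_t id)
{
	snprintf(out, len, "%llu", (unsigned long long)id);
}
```

Because the name is still a single decimal number, anything in userspace
that parses the directory name as an integer keeps working, although the
number would no longer be the inode number.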

Regards,
Christian.
Christian König May 10, 2022, 12:52 p.m. UTC | #7
On 10.05.22 at 14:10, Greg KH wrote:
> On Tue, May 10, 2022 at 01:35:41PM +0200, Christian König wrote:
>> On 10.05.22 at 13:00, Greg KH wrote:
>>> On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
>>>> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
>>>> alloc_anon_inode()) to get an inode number and uses the same as a
>>>> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
>>>> used to collect the dmabuf stats and it is created through
>>>> dma_buf_stats_setup(). At current, failure to create this directory
>>>> entry can make the dma_buf_export() to fail.
>>>>
>>>> Now, as the get_next_ino() can definitely give a repetitive inode no
>>>> causing the directory entry creation to fail with -EEXIST. This is a
>>>> problem on the systems where dmabuf stats functionality is enabled on
>>>> the production builds can make the dma_buf_export(), though the dmabuf
>>>> memory is allocated successfully, to fail just because it couldn't
>>>> create stats entry.
>>> Then maybe we should not fail the creation path of the kobject fails to
>>> be created?  It's just for debugging, it should be fine if the creation
>>> of it isn't there.
>> Well if it's just for debugging then it should be under debugfs and not
>> sysfs.
> I'll note that the original patch series for this described why this was
> moved from debugfs to sysfs.
>
>>>> This issue we are able to see on the snapdragon system within 13 days
>>>> where there already exists a directory with inode no "122602" so
>>>> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
>>>> the same directory entry.
>>>>
>>>> To make the directory entry as unique, append the inode creation time to
>>>> the inode. With this change the stats directory entries will be in the
>>>> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
>>>> secs>.
>>> As you are changing the format here, shouldn't the Documentation/ABI/
>>> entry for this also be changed?
>> As far as I can see that is even an UAPI break, not sure if we can allow
>> that.
> Why?  Device names change all the time and should never be static.  A
> buffer name should just be a unique identifier in that directory, that's
> all.  No rules on the formatting of it unless for some reason the name
> being the inode number was somehow being used in userspace for that
> number?

My impression was that we documented that it should be a number, but 
I might be wrong on this. And if it's not documented to be a number, I 
think it should be.

The background is that you probably need to associate the DMA-buf with 
some userspace structure for accounting, and that becomes easier when you 
can just put them into a radix tree.
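
To illustrate the point, here is a sketch of how a userspace tool might
turn a directory name into an integer key. stats_name_to_key() is a
hypothetical helper, not part of any existing tool. A purely numeric name
parses directly, while a name in the proposed "<ino>-<secs>" format is
rejected by this strict parser and would need extra handling.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Store the key and return 0 when name consists only of decimal digits;
 * return -EINVAL for anything else, including "<ino>-<secs>" names.
 */
static int stats_name_to_key(const char *name, uint64_t *key)
{
	char *end;
	unsigned long long v;

	/* Reject empty names, signs and leading whitespace up front. */
	if (name[0] < '0' || name[0] > '9')
		return -EINVAL;

	errno = 0;
	v = strtoull(name, &end, 10);
	if (errno || *end != '\0')
		return -EINVAL;

	*key = v;
	return 0;
}
```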

Regards,
Christian.

>
> thanks,
>
> greg k-h
Patch

diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-buf/dma-buf-sysfs-stats.c
index 2bba0ba..292cb31 100644
--- a/drivers/dma-buf/dma-buf-sysfs-stats.c
+++ b/drivers/dma-buf/dma-buf-sysfs-stats.c
@@ -192,7 +192,8 @@  int dma_buf_stats_setup(struct dma_buf *dmabuf)
 
 	/* create the directory for buffer stats */
 	ret = kobject_init_and_add(&sysfs_entry->kobj, &dma_buf_ktype, NULL,
-				   "%lu", file_inode(dmabuf->file)->i_ino);
+				   "%lu-%lld", file_inode(dmabuf->file)->i_ino,
+				   (long long)file_inode(dmabuf->file)->i_ctime.tv_sec);
 	if (ret)
 		goto err_sysfs_dmabuf;