diff mbox series

[1/2] coresight: tmc: Don't change the buffer size if it's in use

Message ID 20241114081653.24328-1-yangyicong@huawei.com (mailing list archive)
State New
Headers show
Series [1/2] coresight: tmc: Don't change the buffer size if it's in use | expand

Commit Message

Yicong Yang Nov. 14, 2024, 8:16 a.m. UTC
From: Yicong Yang <yangyicong@hisilicon.com>

Enable the trace in below steps will crash the kernel by NULL pointer
dereferencing:
echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
echo 1 > /sys/bus/coresight/devices/etm0/enable_source
echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
echo 1 > /sys/bus/coresight/devices/etm2/enable_source
dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data

The call trace will be like:
 WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
 [...]
 Call trace:
  __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
  tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
  tmc_open+0x60/0xa0 [coresight_tmc]
  misc_open+0x11c/0x170
  chrdev_open+0xcc/0x2b0
  do_dentry_open+0x140/0x4e0
  vfs_open+0x34/0xf8
  path_openat+0x2b0/0xf58
  do_filp_open+0x8c/0x148
  do_sys_openat2+0xb8/0xe8
  __arm64_sys_openat+0x70/0xc0
  el0_svc_common.constprop.0+0x64/0x148
  do_el0_svc+0x24/0x38
  el0_svc+0x40/0x140
  el0t_64_sync_handler+0xc0/0xc8
  el0t_64_sync+0x1a4/0x1a8
 ---[ end trace 0000000000000000 ]---
 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
 [...]
 Call trace:
  tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
  vfs_read+0xcc/0x310
  ksys_read+0x74/0x108
  __arm64_sys_read+0x24/0x38
  el0_svc_common.constprop.0+0x64/0x148
  do_el0_svc+0x24/0x38
  el0_svc+0x40/0x140

Due to the buffer size changed, the buffer will be reallocated in
tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
trigger the later NULL pointer dereference when reading out the
data.

But it doesn't make sense to change the buffer size when it's
already in use. So block such behavior.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

James Clark Nov. 14, 2024, 10:30 a.m. UTC | #1
On 14/11/2024 8:16 am, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Enable the trace in below steps will crash the kernel by NULL pointer
> dereferencing:
> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
> echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
> echo 1 > /sys/bus/coresight/devices/etm2/enable_source
> dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
> 
> The call trace will be like:
>   WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>   [...]
>   Call trace:
>    __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>    tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
>    tmc_open+0x60/0xa0 [coresight_tmc]
>    misc_open+0x11c/0x170
>    chrdev_open+0xcc/0x2b0
>    do_dentry_open+0x140/0x4e0
>    vfs_open+0x34/0xf8
>    path_openat+0x2b0/0xf58
>    do_filp_open+0x8c/0x148
>    do_sys_openat2+0xb8/0xe8
>    __arm64_sys_openat+0x70/0xc0
>    el0_svc_common.constprop.0+0x64/0x148
>    do_el0_svc+0x24/0x38
>    el0_svc+0x40/0x140
>    el0t_64_sync_handler+0xc0/0xc8
>    el0t_64_sync+0x1a4/0x1a8
>   ---[ end trace 0000000000000000 ]---
>   Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
>   [...]
>   Call trace:
>    tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
>    vfs_read+0xcc/0x310
>    ksys_read+0x74/0x108
>    __arm64_sys_read+0x24/0x38
>    el0_svc_common.constprop.0+0x64/0x148
>    do_el0_svc+0x24/0x38
>    el0_svc+0x40/0x140
> 
> Due to the buffer size changed, the buffer will be reallocated in
> tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
> end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
> trigger the later NULL pointer dereference when reading out the
> data.
> 
> But it doesn't make sense to change the buffer size when it's
> already in use. So block such behavior.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> index 475fa4bb6813..9660af63e9bc 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> @@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device *dev,
>   	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
>   		return -EPERM;
>   
> +	/* Don't change the buffer size if it's in use */
> +	guard(spinlock)(&drvdata->spinlock);
> +	if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)

Size isn't used in perf mode is it? So it can be -EBUSY only when mode 
== CS_MODE_SYSFS.

> +		return -EBUSY;
> +
>   	ret = kstrtoul(buf, 0, &val);
>   	if (ret)
>   		return ret;

Looks ok to me. Although for consistency it might be worth changing to 
guard(mutex)(&coresight_mutex) because this is about sysfs mode only and 
other usages of mode and comments point to coresight_mutex. Using the 
device's spinlock will technically work but it did make me go and double 
check the code. And there are other cases of reading the mode like this:

static ssize_t enable_source_show(struct device *dev,
				  struct device_attribute *attr,
				  char *buf)
{
	struct coresight_device *csdev = to_coresight_device(dev);

	guard(mutex)(&coresight_mutex);
	return scnprintf(buf, PAGE_SIZE, "%u\n",
			 coresight_get_mode(csdev) == CS_MODE_SYSFS);
}

Mode can change to CS_MODE_PERF while inside coresight_mutex but the 
device would end up not being enabled for sysfs, so it's still ok to 
update the sysfs size value in that case.

With that change:

Reviewed-by: James Clark <james.clark@linaro.org>
Yicong Yang Nov. 14, 2024, 2:51 p.m. UTC | #2
On 2024/11/14 18:30, James Clark wrote:
> 
> 
> On 14/11/2024 8:16 am, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Enable the trace in below steps will crash the kernel by NULL pointer
>> dereferencing:
>> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>> echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
>> echo 1 > /sys/bus/coresight/devices/etm2/enable_source
>> dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
>>
>> The call trace will be like:
>>   WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>   [...]
>>   Call trace:
>>    __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>    tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
>>    tmc_open+0x60/0xa0 [coresight_tmc]
>>    misc_open+0x11c/0x170
>>    chrdev_open+0xcc/0x2b0
>>    do_dentry_open+0x140/0x4e0
>>    vfs_open+0x34/0xf8
>>    path_openat+0x2b0/0xf58
>>    do_filp_open+0x8c/0x148
>>    do_sys_openat2+0xb8/0xe8
>>    __arm64_sys_openat+0x70/0xc0
>>    el0_svc_common.constprop.0+0x64/0x148
>>    do_el0_svc+0x24/0x38
>>    el0_svc+0x40/0x140
>>    el0t_64_sync_handler+0xc0/0xc8
>>    el0t_64_sync+0x1a4/0x1a8
>>   ---[ end trace 0000000000000000 ]---
>>   Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
>>   [...]
>>   Call trace:
>>    tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
>>    vfs_read+0xcc/0x310
>>    ksys_read+0x74/0x108
>>    __arm64_sys_read+0x24/0x38
>>    el0_svc_common.constprop.0+0x64/0x148
>>    do_el0_svc+0x24/0x38
>>    el0_svc+0x40/0x140
>>
>> Due to the buffer size changed, the buffer will be reallocated in
>> tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
>> end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
>> trigger the later NULL pointer dereference when reading out the
>> data.
>>
>> But it doesn't make sense to change the buffer size when it's
>> already in use. So block such behavior.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
>> index 475fa4bb6813..9660af63e9bc 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>> @@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device *dev,
>>       if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
>>           return -EPERM;
>>   +    /* Don't change the buffer size if it's in use */
>> +    guard(spinlock)(&drvdata->spinlock);
>> +    if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)
> 
> Size isn't used in perf mode is it? So it can be -EBUSY only when mode == CS_MODE_SYSFS.
> 

alloc_etr_buf() on the perf path will read drvdata->size, not sure it matters if user
change it through sysfs in the meanwhile. Will test and have a check if there are any
other places using size on the perf path.

>> +        return -EBUSY;
>> +
>>       ret = kstrtoul(buf, 0, &val);
>>       if (ret)
>>           return ret;
> 
> Looks ok to me. Although for consistency it might be worth changing to guard(mutex)(&coresight_mutex) because this is about sysfs mode only and other usages of mode and comments point to coresight_mutex. Using the device's spinlock will technically work but it did make me go and double check the code. And there are other cases of reading the mode like this:
> 

ok, I thought to also serialize the use of drvdata->size. But as you mentioned
use coresight_mutex is enough and will be consistenct with other places.

> static ssize_t enable_source_show(struct device *dev,
>                   struct device_attribute *attr,
>                   char *buf)
> {
>     struct coresight_device *csdev = to_coresight_device(dev);
> 
>     guard(mutex)(&coresight_mutex);
>     return scnprintf(buf, PAGE_SIZE, "%u\n",
>              coresight_get_mode(csdev) == CS_MODE_SYSFS);
> }
> 
> Mode can change to CS_MODE_PERF while inside coresight_mutex but the device would end up not being enabled for sysfs, so it's still ok to update the sysfs size value in that case.
> 
> With that change:
> 
> Reviewed-by: James Clark <james.clark@linaro.org>

Thanks.
James Clark Nov. 14, 2024, 3:26 p.m. UTC | #3
On 14/11/2024 2:51 pm, Yicong Yang wrote:
> On 2024/11/14 18:30, James Clark wrote:
>>
>>
>> On 14/11/2024 8:16 am, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>
>>> Enable the trace in below steps will crash the kernel by NULL pointer
>>> dereferencing:
>>> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>>> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>>> echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
>>> echo 1 > /sys/bus/coresight/devices/etm2/enable_source
>>> dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
>>>
>>> The call trace will be like:
>>>    WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>    [...]
>>>    Call trace:
>>>     __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>     tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
>>>     tmc_open+0x60/0xa0 [coresight_tmc]
>>>     misc_open+0x11c/0x170
>>>     chrdev_open+0xcc/0x2b0
>>>     do_dentry_open+0x140/0x4e0
>>>     vfs_open+0x34/0xf8
>>>     path_openat+0x2b0/0xf58
>>>     do_filp_open+0x8c/0x148
>>>     do_sys_openat2+0xb8/0xe8
>>>     __arm64_sys_openat+0x70/0xc0
>>>     el0_svc_common.constprop.0+0x64/0x148
>>>     do_el0_svc+0x24/0x38
>>>     el0_svc+0x40/0x140
>>>     el0t_64_sync_handler+0xc0/0xc8
>>>     el0t_64_sync+0x1a4/0x1a8
>>>    ---[ end trace 0000000000000000 ]---
>>>    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
>>>    [...]
>>>    Call trace:
>>>     tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
>>>     vfs_read+0xcc/0x310
>>>     ksys_read+0x74/0x108
>>>     __arm64_sys_read+0x24/0x38
>>>     el0_svc_common.constprop.0+0x64/0x148
>>>     do_el0_svc+0x24/0x38
>>>     el0_svc+0x40/0x140
>>>
>>> Due to the buffer size changed, the buffer will be reallocated in
>>> tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
>>> end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
>>> trigger the later NULL pointer dereference when reading out the
>>> data.
>>>
>>> But it doesn't make sense to change the buffer size when it's
>>> already in use. So block such behavior.
>>>
>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>> ---
>>>    drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
>>>    1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> index 475fa4bb6813..9660af63e9bc 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> @@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device *dev,
>>>        if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
>>>            return -EPERM;
>>>    +    /* Don't change the buffer size if it's in use */
>>> +    guard(spinlock)(&drvdata->spinlock);
>>> +    if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)
>>
>> Size isn't used in perf mode is it? So it can be -EBUSY only when mode == CS_MODE_SYSFS.
>>
> 
> alloc_etr_buf() on the perf path will read drvdata->size, not sure it matters if user
> change it through sysfs in the meanwhile. Will test and have a check if there are any
> other places using size on the perf path.
> 

Hmmm I assumed that Perf mode completely ignored anything from sysfs 
mode. I see that alloc_etr_buf() does sometimes use the sysfs value. I 
don't really see why that's necessary because that means it sometimes 
ignores the buffer size from the perf command line depending on what's 
in sysfs, but the modes should be mutually exclusive.

Unless we fix that then I think you do need to use the device spinlock. 
But I think we should tidy up alloc_etr_buf() to only try to allocate 
from the Perf size down to TMC_ETR_PERF_MIN_BUF_SIZE, ignoring 
drvdata->size. Then the behavior is less surprising to users and also 
anyone reading the code. And rename it to alloc_etr_buf_perf().

Unless Suzuki knows of a reason it was done that way to begin with? I 
checked the commit message but it just says that it was like that but 
not why.

>>> +        return -EBUSY;
>>> +
>>>        ret = kstrtoul(buf, 0, &val);
>>>        if (ret)
>>>            return ret;
>>
>> Looks ok to me. Although for consistency it might be worth changing to guard(mutex)(&coresight_mutex) because this is about sysfs mode only and other usages of mode and comments point to coresight_mutex. Using the device's spinlock will technically work but it did make me go and double check the code. And there are other cases of reading the mode like this:
>>
> 
> ok, I thought to also serialize the use of drvdata->size. But as you mentioned
> use coresight_mutex is enough and will be consistenct with other places.
> 
>> static ssize_t enable_source_show(struct device *dev,
>>                    struct device_attribute *attr,
>>                    char *buf)
>> {
>>      struct coresight_device *csdev = to_coresight_device(dev);
>>
>>      guard(mutex)(&coresight_mutex);
>>      return scnprintf(buf, PAGE_SIZE, "%u\n",
>>               coresight_get_mode(csdev) == CS_MODE_SYSFS);
>> }
>>
>> Mode can change to CS_MODE_PERF while inside coresight_mutex but the device would end up not being enabled for sysfs, so it's still ok to update the sysfs size value in that case.
>>
>> With that change:
>>
>> Reviewed-by: James Clark <james.clark@linaro.org>
> 
> Thanks.
>
Suzuki K Poulose Nov. 14, 2024, 5:20 p.m. UTC | #4
Hi,

Thanks for the report, see my comments inline.

On 14/11/2024 15:26, James Clark wrote:
> 
> 
> On 14/11/2024 2:51 pm, Yicong Yang wrote:
>> On 2024/11/14 18:30, James Clark wrote:
>>>
>>>
>>> On 14/11/2024 8:16 am, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>
>>>> Enable the trace in below steps will crash the kernel by NULL pointer
>>>> dereferencing:
>>>> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>>>> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>>>> echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
>>>> echo 1 > /sys/bus/coresight/devices/etm2/enable_source
>>>> dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
>>>>
>>>> The call trace will be like:
>>>>    WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/ 
>>>> coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 
>>>> [coresight_tmc]
>>>>    [...]
>>>>    Call trace:
>>>>     __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>>     tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
>>>>     tmc_open+0x60/0xa0 [coresight_tmc]
>>>>     misc_open+0x11c/0x170
>>>>     chrdev_open+0xcc/0x2b0
>>>>     do_dentry_open+0x140/0x4e0
>>>>     vfs_open+0x34/0xf8
>>>>     path_openat+0x2b0/0xf58
>>>>     do_filp_open+0x8c/0x148
>>>>     do_sys_openat2+0xb8/0xe8
>>>>     __arm64_sys_openat+0x70/0xc0
>>>>     el0_svc_common.constprop.0+0x64/0x148
>>>>     do_el0_svc+0x24/0x38
>>>>     el0_svc+0x40/0x140
>>>>     el0t_64_sync_handler+0xc0/0xc8
>>>>     el0t_64_sync+0x1a4/0x1a8
>>>>    ---[ end trace 0000000000000000 ]---
>>>>    Unable to handle kernel NULL pointer dereference at virtual 
>>>> address 0000000000000028
>>>>    [...]
>>>>    Call trace:
>>>>     tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
>>>>     vfs_read+0xcc/0x310
>>>>     ksys_read+0x74/0x108
>>>>     __arm64_sys_read+0x24/0x38
>>>>     el0_svc_common.constprop.0+0x64/0x148
>>>>     do_el0_svc+0x24/0x38
>>>>     el0_svc+0x40/0x140
>>>>
>>>> Due to the buffer size changed, the buffer will be reallocated in
>>>> tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
>>>> end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
>>>> trigger the later NULL pointer dereference when reading out the
>>>> data.
>>>>
>>>> But it doesn't make sense to change the buffer size when it's
>>>> already in use. So block such behavior.
>>>>
>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>> ---
>>>>    drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
>>>>    1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/ 
>>>> drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> index 475fa4bb6813..9660af63e9bc 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> @@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device 
>>>> *dev,
>>>>        if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
>>>>            return -EPERM;
>>>>    +    /* Don't change the buffer size if it's in use */
>>>> +    guard(spinlock)(&drvdata->spinlock);
>>>> +    if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)

Could we do something like this below ?

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c 
b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index a48bb85d0e7f..863a645fa88a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1178,7 +1178,9 @@ static struct etr_buf 
*tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
          */
         spin_lock_irqsave(&drvdata->spinlock, flags);
         sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
-       if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) {
+       if (!sysfs_buf ||
+           ((sysfs_buf->size != drvdata->size) &&
+             coresight_get_mode(csdev) != CS_MODE_SYSFS))
                 spin_unlock_irqrestore(&drvdata->spinlock, flags);

                 /* Allocate memory with the locks released */

i.e., do not allocate a new buffer if the sysfs mode is active. The new
size can be set when the new session starts


>>>
>>> Size isn't used in perf mode is it? So it can be -EBUSY only when 
>>> mode == CS_MODE_SYSFS.
>>>
>>
>> alloc_etr_buf() on the perf path will read drvdata->size, not sure it 
>> matters if user
>> change it through sysfs in the meanwhile. Will test and have a check 
>> if there are any
>> other places using size on the perf path.

That was there to make sure the user can allocate a bigger buffer (of
the AUX size vs sysfs configured size) and possibly collect more trace
(i.e., in multiple aux buffers). But looks like that is not useful,
given we can only ever collect to one AUX (the last one turning ETR off).

So we could remove that check.

Suzuki


>>
> 
> Hmmm I assumed that Perf mode completely ignored anything from sysfs 
> mode. I see that alloc_etr_buf() does sometimes use the sysfs value. I 
> don't really see why that's necessary because that means it sometimes 
> ignores the buffer size from the perf command line depending on what's 
> in sysfs, but the modes should be mutually exclusive.
> 
> Unless we fix that then I think you do need to use the device spinlock. 
> But I think we should tidy up alloc_etr_buf() to only try to allocate 
> from the Perf size down to TMC_ETR_PERF_MIN_BUF_SIZE, ignoring drvdata- 
>  >size. Then the behavior is less surprising to users and also anyone 
> reading the code. And rename it to alloc_etr_buf_perf().
> 
> Unless Suzuki knows of a reason it was done that way to begin with? I 
> checked the commit message but it just says that it was like that but 
> not why.
> 
>>>> +        return -EBUSY;
>>>> +
>>>>        ret = kstrtoul(buf, 0, &val);
>>>>        if (ret)
>>>>            return ret;
>>>
>>> Looks ok to me. Although for consistency it might be worth changing 
>>> to guard(mutex)(&coresight_mutex) because this is about sysfs mode 
>>> only and other usages of mode and comments point to coresight_mutex. 
>>> Using the device's spinlock will technically work but it did make me 
>>> go and double check the code. And there are other cases of reading 
>>> the mode like this:
>>>
>>
>> ok, I thought to also serialize the use of drvdata->size. But as you 
>> mentioned
>> use coresight_mutex is enough and will be consistenct with other places.
>>
>>> static ssize_t enable_source_show(struct device *dev,
>>>                    struct device_attribute *attr,
>>>                    char *buf)
>>> {
>>>      struct coresight_device *csdev = to_coresight_device(dev);
>>>
>>>      guard(mutex)(&coresight_mutex);
>>>      return scnprintf(buf, PAGE_SIZE, "%u\n",
>>>               coresight_get_mode(csdev) == CS_MODE_SYSFS);
>>> }
>>>
>>> Mode can change to CS_MODE_PERF while inside coresight_mutex but the 
>>> device would end up not being enabled for sysfs, so it's still ok to 
>>> update the sysfs size value in that case.
>>>
>>> With that change:
>>>
>>> Reviewed-by: James Clark <james.clark@linaro.org>
>>
>> Thanks.
>>
>
Yicong Yang Nov. 19, 2024, 12:40 p.m. UTC | #5
On 2024/11/15 1:20, Suzuki K Poulose wrote:
> Hi,
> 
> Thanks for the report, see my comments inline.
> 
> On 14/11/2024 15:26, James Clark wrote:
>>
>>
>> On 14/11/2024 2:51 pm, Yicong Yang wrote:
>>> On 2024/11/14 18:30, James Clark wrote:
>>>>
>>>>
>>>> On 14/11/2024 8:16 am, Yicong Yang wrote:
>>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>>
>>>>> Enable the trace in below steps will crash the kernel by NULL pointer
>>>>> dereferencing:
>>>>> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>>>>> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>>>>> echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
>>>>> echo 1 > /sys/bus/coresight/devices/etm2/enable_source
>>>>> dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
>>>>>
>>>>> The call trace will be like:
>>>>>    WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/ coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>>>    [...]
>>>>>    Call trace:
>>>>>     __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>>>     tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
>>>>>     tmc_open+0x60/0xa0 [coresight_tmc]
>>>>>     misc_open+0x11c/0x170
>>>>>     chrdev_open+0xcc/0x2b0
>>>>>     do_dentry_open+0x140/0x4e0
>>>>>     vfs_open+0x34/0xf8
>>>>>     path_openat+0x2b0/0xf58
>>>>>     do_filp_open+0x8c/0x148
>>>>>     do_sys_openat2+0xb8/0xe8
>>>>>     __arm64_sys_openat+0x70/0xc0
>>>>>     el0_svc_common.constprop.0+0x64/0x148
>>>>>     do_el0_svc+0x24/0x38
>>>>>     el0_svc+0x40/0x140
>>>>>     el0t_64_sync_handler+0xc0/0xc8
>>>>>     el0t_64_sync+0x1a4/0x1a8
>>>>>    ---[ end trace 0000000000000000 ]---
>>>>>    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
>>>>>    [...]
>>>>>    Call trace:
>>>>>     tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
>>>>>     vfs_read+0xcc/0x310
>>>>>     ksys_read+0x74/0x108
>>>>>     __arm64_sys_read+0x24/0x38
>>>>>     el0_svc_common.constprop.0+0x64/0x148
>>>>>     do_el0_svc+0x24/0x38
>>>>>     el0_svc+0x40/0x140
>>>>>
>>>>> Due to the buffer size changed, the buffer will be reallocated in
>>>>> tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
>>>>> end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
>>>>> trigger the later NULL pointer dereference when reading out the
>>>>> data.
>>>>>
>>>>> But it doesn't make sense to change the buffer size when it's
>>>>> already in use. So block such behavior.
>>>>>
>>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>>> ---
>>>>>    drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
>>>>>    1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/ drivers/hwtracing/coresight/coresight-tmc-core.c
>>>>> index 475fa4bb6813..9660af63e9bc 100644
>>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>>> @@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device *dev,
>>>>>        if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
>>>>>            return -EPERM;
>>>>>    +    /* Don't change the buffer size if it's in use */
>>>>> +    guard(spinlock)(&drvdata->spinlock);
>>>>> +    if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)
> 
> Could we do something like this below ?
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index a48bb85d0e7f..863a645fa88a 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -1178,7 +1178,9 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>          */
>         spin_lock_irqsave(&drvdata->spinlock, flags);
>         sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
> -       if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) {
> +       if (!sysfs_buf ||
> +           ((sysfs_buf->size != drvdata->size) &&
> +             coresight_get_mode(csdev) != CS_MODE_SYSFS))
>                 spin_unlock_irqrestore(&drvdata->spinlock, flags);
> 
>                 /* Allocate memory with the locks released */
> 
> i.e., do not allocate a new buffer if the sysfs mode is active. The new
> size can be set when the new session starts
> 

tested with steps in the commit and perf (below) simultaneously and don't see the
problem mentioned.
perf record -e cs_etm// -C 0 -- sleep 1 2>&1 > /dev/null

It's a bit confusing with this fix since we actually discard/delay the user's request
of changing the buffer size but no error/information returned to user. If this is not
a problem the fix is fine for me.

Thanks.

> 
>>>>
>>>> Size isn't used in perf mode is it? So it can be -EBUSY only when mode == CS_MODE_SYSFS.
>>>>
>>>
>>> alloc_etr_buf() on the perf path will read drvdata->size, not sure it matters if user
>>> change it through sysfs in the meanwhile. Will test and have a check if there are any
>>> other places using size on the perf path.
> 
> That was there to make sure the user can allocate a bigger buffer (of
> the AUX size vs sysfs configured size) and possibly collect more trace
> (i.e., in multiple aux buffers). But looks like that is not useful,
> given we can only ever collect to one AUX (the last one turning ETR off).
> 
> So we could remove that check.
> 
> Suzuki
> 
> 
>>>
>>
>> Hmmm I assumed that Perf mode completely ignored anything from sysfs mode. I see that alloc_etr_buf() does sometimes use the sysfs value. I don't really see why that's necessary because that means it sometimes ignores the buffer size from the perf command line depending on what's in sysfs, but the modes should be mutually exclusive.
>>
>> Unless we fix that then I think you do need to use the device spinlock. But I think we should tidy up alloc_etr_buf() to only try to allocate from the Perf size down to TMC_ETR_PERF_MIN_BUF_SIZE, ignoring drvdata-  >size. Then the behavior is less surprising to users and also anyone reading the code. And rename it to alloc_etr_buf_perf().
>>
>> Unless Suzuki knows of a reason it was done that way to begin with? I checked the commit message but it just says that it was like that but not why.
>>
>>>>> +        return -EBUSY;
>>>>> +
>>>>>        ret = kstrtoul(buf, 0, &val);
>>>>>        if (ret)
>>>>>            return ret;
>>>>
>>>> Looks ok to me. Although for consistency it might be worth changing to guard(mutex)(&coresight_mutex) because this is about sysfs mode only and other usages of mode and comments point to coresight_mutex. Using the device's spinlock will technically work but it did make me go and double check the code. And there are other cases of reading the mode like this:
>>>>
>>>
>>> ok, I thought to also serialize the use of drvdata->size. But as you mentioned
>>> use coresight_mutex is enough and will be consistenct with other places.
>>>
>>>> static ssize_t enable_source_show(struct device *dev,
>>>>                    struct device_attribute *attr,
>>>>                    char *buf)
>>>> {
>>>>      struct coresight_device *csdev = to_coresight_device(dev);
>>>>
>>>>      guard(mutex)(&coresight_mutex);
>>>>      return scnprintf(buf, PAGE_SIZE, "%u\n",
>>>>               coresight_get_mode(csdev) == CS_MODE_SYSFS);
>>>> }
>>>>
>>>> Mode can change to CS_MODE_PERF while inside coresight_mutex but the device would end up not being enabled for sysfs, so it's still ok to update the sysfs size value in that case.
>>>>
>>>> With that change:
>>>>
>>>> Reviewed-by: James Clark <james.clark@linaro.org>
>>>
>>> Thanks.
>>>
>>
> 
> 
> .
Suzuki K Poulose Nov. 19, 2024, 1:52 p.m. UTC | #6
Hi

On 19/11/2024 12:40, Yicong Yang wrote:
> On 2024/11/15 1:20, Suzuki K Poulose wrote:
>> Hi,
>>
>> Thanks for the report, see my comments inline.
>>
>> On 14/11/2024 15:26, James Clark wrote:
>>>
>>>
>>> On 14/11/2024 2:51 pm, Yicong Yang wrote:
>>>> On 2024/11/14 18:30, James Clark wrote:
>>>>>
>>>>>
>>>>> On 14/11/2024 8:16 am, Yicong Yang wrote:
>>>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>>>
>>>>>> Enable the trace in below steps will crash the kernel by NULL pointer
>>>>>> dereferencing:
>>>>>> echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>>>>>> echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>>>>>> echo 0x400000 > /sys/bus/coresight/devices/tmc_etr0/buffer_size
>>>>>> echo 1 > /sys/bus/coresight/devices/etm2/enable_source
>>>>>> dd if=/dev/tmc_etr0 of=test_etm_sysfs_etr_030.data
>>>>>>
>>>>>> The call trace will be like:
>>>>>>     WARNING: CPU: 39 PID: 8586 at drivers/hwtracing/coresight/ coresight-tmc-etr.c:1123 __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>>>>     [...]
>>>>>>     Call trace:
>>>>>>      __tmc_etr_disable_hw+0x108/0x140 [coresight_tmc]
>>>>>>      tmc_read_prepare_etr+0xc0/0xd0 [coresight_tmc]
>>>>>>      tmc_open+0x60/0xa0 [coresight_tmc]
>>>>>>      misc_open+0x11c/0x170
>>>>>>      chrdev_open+0xcc/0x2b0
>>>>>>      do_dentry_open+0x140/0x4e0
>>>>>>      vfs_open+0x34/0xf8
>>>>>>      path_openat+0x2b0/0xf58
>>>>>>      do_filp_open+0x8c/0x148
>>>>>>      do_sys_openat2+0xb8/0xe8
>>>>>>      __arm64_sys_openat+0x70/0xc0
>>>>>>      el0_svc_common.constprop.0+0x64/0x148
>>>>>>      do_el0_svc+0x24/0x38
>>>>>>      el0_svc+0x40/0x140
>>>>>>      el0t_64_sync_handler+0xc0/0xc8
>>>>>>      el0t_64_sync+0x1a4/0x1a8
>>>>>>     ---[ end trace 0000000000000000 ]---
>>>>>>     Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
>>>>>>     [...]
>>>>>>     Call trace:
>>>>>>      tmc_etr_get_sysfs_trace+0x10/0x80 [coresight_tmc]
>>>>>>      vfs_read+0xcc/0x310
>>>>>>      ksys_read+0x74/0x108
>>>>>>      __arm64_sys_read+0x24/0x38
>>>>>>      el0_svc_common.constprop.0+0x64/0x148
>>>>>>      do_el0_svc+0x24/0x38
>>>>>>      el0_svc+0x40/0x140
>>>>>>
>>>>>> Due to the buffer size changed, the buffer will be reallocated in
>>>>>> tmc_etr_get_sysfs_buffer() when the second source enabled. At trace
>>>>>> end tmc_etr_sync_sysfs_buf() will reset the drvdata->sysfs_buf and
>>>>>> trigger the later NULL pointer dereference when reading out the
>>>>>> data.
>>>>>>
>>>>>> But it doesn't make sense to change the buffer size when it's
>>>>>> already in use. So block such behavior.
>>>>>>
>>>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>>>> ---
>>>>>>     drivers/hwtracing/coresight/coresight-tmc-core.c | 5 +++++
>>>>>>     1 file changed, 5 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/ drivers/hwtracing/coresight/coresight-tmc-core.c
>>>>>> index 475fa4bb6813..9660af63e9bc 100644
>>>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>>>> @@ -319,6 +319,11 @@ static ssize_t buffer_size_store(struct device *dev,
>>>>>>         if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
>>>>>>             return -EPERM;
>>>>>>     +    /* Don't change the buffer size if it's in use */
>>>>>> +    guard(spinlock)(&drvdata->spinlock);
>>>>>> +    if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)
>>
>> Could we do something like this below ?
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> index a48bb85d0e7f..863a645fa88a 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> @@ -1178,7 +1178,9 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>           */
>>          spin_lock_irqsave(&drvdata->spinlock, flags);
>>          sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
>> -       if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) {
>> +       if (!sysfs_buf ||
>> +           ((sysfs_buf->size != drvdata->size) &&
>> +             coresight_get_mode(csdev) != CS_MODE_SYSFS))
>>                  spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>
>>                  /* Allocate memory with the locks released */
>>
>> i.e., do not allocate a new buffer if the sysfs mode is active. The new
>> size can be set when the new session starts
>>
> 
> tested with steps in the commit and perf (below) simultaneously and don't see the
> problem mentioned.
> perf record -e cs_etm// -C 0 -- sleep 1 2>&1 > /dev/null

Thanks for the testing !
> 
> It's a bit confusing with this fix since we actually discard/delay the user's request
> of changing the buffer size but no error/information returned to user. If this is not
> a problem the fix is fine for me.

We do not discard the users request. Also, for all practical purposes
there is no "delay" in the new buffer size usage, since the existing
session (of the "sink") cannot change the buffer size while it is
active. It will only be effective when a new session starts, which is
the case with this patch.

Suzuki


> 
> Thanks.
> 
>>
>>>>>
>>>>> Size isn't used in perf mode is it? So it can be -EBUSY only when mode == CS_MODE_SYSFS.
>>>>>
>>>>
>>>> alloc_etr_buf() on the perf path will read drvdata->size, not sure it matters if user
>>>> change it through sysfs in the meanwhile. Will test and have a check if there are any
>>>> other places using size on the perf path.
>>
>> That was there to make sure the user can allocate a bigger buffer (of
>> the AUX size vs sysfs configured size) and possibly collect more trace
>> (i.e., in multiple aux buffers). But looks like that is not useful,
>> given we can only ever collect to one AUX (the last one turning ETR off).
>>
>> So we could remove that check.
>>
>> Suzuki
>>
>>
>>>>
>>>
>>> Hmmm I assumed that Perf mode completely ignored anything from sysfs mode. I see that alloc_etr_buf() does sometimes use the sysfs value. I don't really see why that's necessary because that means it sometimes ignores the buffer size from the perf command line depending on what's in sysfs, but the modes should be mutually exclusive.
>>>
>>> Unless we fix that then I think you do need to use the device spinlock. But I think we should tidy up alloc_etr_buf() to only try to allocate from the Perf size down to TMC_ETR_PERF_MIN_BUF_SIZE, ignoring drvdata-  >size. Then the behavior is less surprising to users and also anyone reading the code. And rename it to alloc_etr_buf_perf().
>>>
>>> Unless Suzuki knows of a reason it was done that way to begin with? I checked the commit message but it just says that it was like that but not why.
>>>
>>>>>> +        return -EBUSY;
>>>>>> +
>>>>>>         ret = kstrtoul(buf, 0, &val);
>>>>>>         if (ret)
>>>>>>             return ret;
>>>>>
>>>>> Looks ok to me. Although for consistency it might be worth changing to guard(mutex)(&coresight_mutex) because this is about sysfs mode only and other usages of mode and comments point to coresight_mutex. Using the device's spinlock will technically work but it did make me go and double check the code. And there are other cases of reading the mode like this:
>>>>>
>>>>
>>>> ok, I thought to also serialize the use of drvdata->size. But as you mentioned
>>>> use coresight_mutex is enough and will be consistenct with other places.
>>>>
>>>>> static ssize_t enable_source_show(struct device *dev,
>>>>>                     struct device_attribute *attr,
>>>>>                     char *buf)
>>>>> {
>>>>>       struct coresight_device *csdev = to_coresight_device(dev);
>>>>>
>>>>>       guard(mutex)(&coresight_mutex);
>>>>>       return scnprintf(buf, PAGE_SIZE, "%u\n",
>>>>>                coresight_get_mode(csdev) == CS_MODE_SYSFS);
>>>>> }
>>>>>
>>>>> Mode can change to CS_MODE_PERF while inside coresight_mutex but the device would end up not being enabled for sysfs, so it's still ok to update the sysfs size value in that case.
>>>>>
>>>>> With that change:
>>>>>
>>>>> Reviewed-by: James Clark <james.clark@linaro.org>
>>>>
>>>> Thanks.
>>>>
>>>
>>
>>
>> .
diff mbox series

Patch

diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 475fa4bb6813..9660af63e9bc 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -319,6 +319,11 @@  static ssize_t buffer_size_store(struct device *dev,
 	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
 		return -EPERM;
 
+	/* Don't change the buffer size if it's in use */
+	guard(spinlock)(&drvdata->spinlock);
+	if (coresight_get_mode(drvdata->csdev) != CS_MODE_DISABLED)
+		return -EBUSY;
+
 	ret = kstrtoul(buf, 0, &val);
 	if (ret)
 		return ret;