Message ID | 1644349472-31077-1-git-send-email-quic_abhinavk@quicinc.com
---|---
State | Not Applicable
Series | devcoredump: increase the device delete timeout to 10 mins
On Tue, 2022-02-08 at 11:44 -0800, Abhinav Kumar wrote:
> There are cases where depending on the size of the devcoredump and the speed
> at which the usermode reads the dump, it can take longer than the current 5 mins
> timeout.
>
> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>
> One example is below where it took 6 mins for the devcoredump to be completely read.
>
> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>
> Increase the timeout to 10 mins to accommodate system delays and large coredump
> sizes.
>

No real objection, I guess, but can the data actually disappear *while*
the sysfs file is open?!

Or did it take 5 minutes to open the file?

If the former, maybe we should fix that too (or instead)?

johannes
Hi Johannes

Thanks for the response.

On 2/8/2022 12:35 PM, Johannes Berg wrote:
> On Tue, 2022-02-08 at 11:44 -0800, Abhinav Kumar wrote:
>> There are cases where depending on the size of the devcoredump and the speed
>> at which the usermode reads the dump, it can take longer than the current 5 mins
>> timeout.
>>
>> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>>
>> One example is below where it took 6 mins for the devcoredump to be completely read.
>>
>> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
>> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>>
>> Increase the timeout to 10 mins to accommodate system delays and large coredump
>> sizes.
>>
>
> No real objection, I guess, but can the data actually disappear *while*
> the sysfs file is open?!
>
> Or did it take 5 minutes to open the file?
>
> If the former, maybe we should fix that too (or instead)?
>
> johannes

It opened the file right away but could not finish reading before the
timeout. The device gets deleted, so the corresponding data node
disappears too (as it sits under devcd*/data):

static void devcd_del(struct work_struct *wk)
{
	struct devcd_entry *devcd;

	devcd = container_of(wk, struct devcd_entry, del_wk.work);

	device_del(&devcd->devcd_dev);
	put_device(&devcd->devcd_dev);
}

Are you suggesting we implement logic like this:

a) If usermode has started reading the data but has not finished yet (we
can detect the former with something like devcd->data_read_ongoing = 1,
and we know it has finished when it acks, at which point we clear the
flag), then in the timeout del_wk we delay the delete timer by another
TIMEOUT amount of time to give usermode time to finish reading the data.

b) If usermode acks, we clear the flag and delete the device as usual.

But there is a corner case here:

c) If usermode starts the read but then crashes for some reason, the
timer will expire, try to delete the device, detect that usermode is
still reading, and keep the device. How do we detect this case?

That's why I thought the easier way right now is to increase the
timeout.
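For concreteness, a minimal sketch of what (a) and (b) could look like in
devcd_del(), assuming a hypothetical devcd->data_read_ongoing flag; neither
the flag nor the re-arm exists in the driver as posted:

/* Sketch only: data_read_ongoing is a made-up field, not part of the
 * current devcoredump driver; set when a reader opens the data file,
 * cleared when userspace acks.
 */
static void devcd_del(struct work_struct *wk)
{
	struct devcd_entry *devcd;

	devcd = container_of(wk, struct devcd_entry, del_wk.work);

	/* (a) a reader is mid-dump: push deletion out by another timeout */
	if (devcd->data_read_ongoing) {
		schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
		return;
	}

	/* (b) acked, or nobody ever read: tear the device down as before */
	device_del(&devcd->devcd_dev);
	put_device(&devcd->devcd_dev);
}

As corner case (c) points out, if the reader dies without acking this
would keep re-arming indefinitely, so some liveness check would still be
needed.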
On Tue, 2022-02-08 at 13:40 -0800, Abhinav Kumar wrote:
>>
> I am checking what usermode sees and will get back (I didn't see an
> error so most likely it was EOF). I didn't follow the second part.

I think probably it got -ENODEV, looking at kernfs_file_read_iter().

> If the file descriptor read returns EOF, even if we consider them
> separate how will it resolve this issue?
>
> My earlier questions were related to fixing it in devcoredump to detect
> and fix it there. Are you suggesting to fix in usermode instead? How?
>

Yeah, no, you cannot fix it in userspace.

But I just followed the rabbit hole down kernfs and all, and it looks
like indeed the read would be cut short with -ENODEV, sorry.

It doesn't look like there's good API for this, but it seems at least
from the underlying kernfs POV it should be possible to get_device() in
open and put_device() in release, so that the device sticks around while
somebody has the file open? It's entirely virtual, so this should be OK?

johannes
Hi Johannes

On 2/8/2022 1:54 PM, Johannes Berg wrote:
> On Tue, 2022-02-08 at 13:40 -0800, Abhinav Kumar wrote:
>>>
>> I am checking what usermode sees and will get back (I didn't see an
>> error so most likely it was EOF). I didn't follow the second part.
>
> I think probably it got -ENODEV, looking at kernfs_file_read_iter().
>
>> If the file descriptor read returns EOF, even if we consider them
>> separate how will it resolve this issue?
>>
>> My earlier questions were related to fixing it in devcoredump to detect
>> and fix it there. Are you suggesting to fix in usermode instead? How?
>>
>
> Yeah, no, you cannot fix it in userspace.
>
> But I just followed the rabbit hole down kernfs and all, and it looks
> like indeed the read would be cut short with -ENODEV, sorry.
>
> It doesn't look like there's good API for this, but it seems at least
> from the underlying kernfs POV it should be possible to get_device() in
> open and put_device() in release, so that the device sticks around while
> somebody has the file open? It's entirely virtual, so this should be OK?
>
> johannes

Are you suggesting something like below?

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 42dcf96..14203d0 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -32,6 +32,23 @@ static const struct sysfs_ops *sysfs_file_ops(struct kernfs_node *kn)
 	return kobj->ktype ? kobj->ktype->sysfs_ops : NULL;
 }

+static int sysfs_kf_open(struct kernfs_open_file *of)
+{
+	struct kobject *kobj = of->kn->parent->priv;
+	struct device *dev = kobj_to_dev(kobj);
+
+	get_device(dev);
+	return 0;
+}
+
+static void sysfs_kf_release(struct kernfs_open_file *of)
+{
+	struct kobject *kobj = of->kn->parent->priv;
+	struct device *dev = kobj_to_dev(kobj);
+
+	put_device(dev);
+}
+
 /*
  * Reads on sysfs are handled through seq_file, which takes care of hairy
  * details like buffering and seeking. The following function pipes
@@ -211,6 +228,8 @@ static const struct kernfs_ops sysfs_file_kfops_wo = {
 };

 static const struct kernfs_ops sysfs_file_kfops_rw = {
+	.open		= sysfs_kf_open,
+	.release	= sysfs_kf_release,
 	.seq_show	= sysfs_kf_seq_show,
 	.write		= sysfs_kf_write,
 };

If so, don't you think this will be a more intrusive change just for the
sake of devcoredump? Any other way to keep the changes limited to
devcoredump?

Thanks

Abhinav
On Tue, 2022-02-08 at 17:55 -0800, Abhinav Kumar wrote:
>
> Are you suggesting something like below?
>
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index 42dcf96..14203d0 100644
> --- a/fs/sysfs/file.c
>

No, for sure not, but I guess from the looks of this patch there's no
way to do something like that for just an individual attribute...

Oh well.

johannes
Hi Johannes

On 2/8/2022 11:50 PM, Johannes Berg wrote:
> On Tue, 2022-02-08 at 17:55 -0800, Abhinav Kumar wrote:
>>
>> Are you suggesting something like below?
>>
>> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
>> index 42dcf96..14203d0 100644
>> --- a/fs/sysfs/file.c
>>
>
> No, for sure not, but I guess from the looks of this patch there's no
> way to do something like that for just an individual attribute...
>
> Oh well.
>
> johannes

In that case, I was not clear on the previous solution you suggested.
Are you suggesting then that we can go ahead with the timeout increase?
If so, can I please get your ack?

Thanks

Abhinav
On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
> There are cases where depending on the size of the devcoredump and the speed
> at which the usermode reads the dump, it can take longer than the current 5 mins
> timeout.
>
> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>
> One example is below where it took 6 mins for the devcoredump to be completely read.
>
> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node

What makes this so slow? Reading from the kernel shouldn't be the
limit, is it where the data is being sent to?

> Increase the timeout to 10 mins to accommodate system delays and large coredump
> sizes.

Nit, please wrap your changelog texts at 72 columns.

And what is "large"?

thanks,

greg k-h
On Tue, Feb 08, 2022 at 05:55:18PM -0800, Abhinav Kumar wrote:
> Hi Johannes
>
> On 2/8/2022 1:54 PM, Johannes Berg wrote:
> > On Tue, 2022-02-08 at 13:40 -0800, Abhinav Kumar wrote:
> > > >
> > > I am checking what usermode sees and will get back (I didn't see an
> > > error so most likely it was EOF). I didn't follow the second part.
> >
> > I think probably it got -ENODEV, looking at kernfs_file_read_iter().
> >
> > > If the file descriptor read returns EOF, even if we consider them
> > > separate how will it resolve this issue?
> > >
> > > My earlier questions were related to fixing it in devcoredump to detect
> > > and fix it there. Are you suggesting to fix in usermode instead? How?
> > >
> >
> > Yeah, no, you cannot fix it in userspace.
> >
> > But I just followed the rabbit hole down kernfs and all, and it looks
> > like indeed the read would be cut short with -ENODEV, sorry.
> >
> > It doesn't look like there's good API for this, but it seems at least
> > from the underlying kernfs POV it should be possible to get_device() in
> > open and put_device() in release, so that the device sticks around while
> > somebody has the file open? It's entirely virtual, so this should be OK?
> >
> > johannes
>
> Are you suggesting something like below?
>
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index 42dcf96..14203d0 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -32,6 +32,23 @@ static const struct sysfs_ops *sysfs_file_ops(struct kernfs_node *kn)
>  	return kobj->ktype ? kobj->ktype->sysfs_ops : NULL;
>  }
>
> +static int sysfs_kf_open(struct kernfs_open_file *of)
> +{
> +	struct kobject *kobj = of->kn->parent->priv;
> +	struct device *dev = kobj_to_dev(kobj);
> +
> +	get_device(dev);
> +	return 0;
> +}
> +
> +static void sysfs_kf_release(struct kernfs_open_file *of)
> +{
> +	struct kobject *kobj = of->kn->parent->priv;
> +	struct device *dev = kobj_to_dev(kobj);
> +
> +	put_device(dev);
> +}

That obviously does not work as not everything in sysfs is a struct
device :(
Hi Greg

Thanks for the response.

On 2/11/2022 3:09 AM, Greg KH wrote:
> On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
>> There are cases where depending on the size of the devcoredump and the speed
>> at which the usermode reads the dump, it can take longer than the current 5 mins
>> timeout.
>>
>> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>>
>> One example is below where it took 6 mins for the devcoredump to be completely read.
>>
>> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
>> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>
> What makes this so slow? Reading from the kernel shouldn't be the
> limit, is it where the data is being sent to?

We are still checking this. We are seeing better read times when we bump up
the priority of the thread which was reading this.

We are also trying to check if bumping up CPU speed is helping.
But results have not been consistently good enough. So we thought we should
also increase the timeout to be safe.

>
>> Increase the timeout to 10 mins to accommodate system delays and large coredump
>> sizes.
>
> Nit, please wrap your changelog texts at 72 columns.
>

Yes, I will fix this when I re-post.

> And what is "large"?

We are seeing devcoredumps in the range of 2.5MB-3MB. I can also mention
this in the commit text in the next post.

Thanks

Abhinav

>
> thanks,
>
> greg k-h
On Fri, Feb 11, 2022 at 10:59:39AM -0800, Abhinav Kumar wrote:
> Hi Greg
>
> Thanks for the response.
>
> On 2/11/2022 3:09 AM, Greg KH wrote:
> > On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
> > > There are cases where depending on the size of the devcoredump and the speed
> > > at which the usermode reads the dump, it can take longer than the current 5 mins
> > > timeout.
> > >
> > > This can lead to incomplete dumps as the device is deleted once the timeout expires.
> > >
> > > One example is below where it took 6 mins for the devcoredump to be completely read.
> > >
> > > 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
> > > 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
> >
> > What makes this so slow? Reading from the kernel shouldn't be the
> > limit, is it where the data is being sent to?
>
> We are still checking this. We are seeing better read times when we bump up
> the priority of the thread which was reading this.

Where is the thread sending the data to?

> We are also trying to check if bumping up CPU speed is helping.
> But results have not been consistently good enough. So we thought we should
> also increase the timeout to be safe.

Why would 10 minutes be better than 30? What should the limit be? :)

thanks,

greg k-h
Hi Greg

On 2/11/2022 11:04 PM, Greg KH wrote:
> On Fri, Feb 11, 2022 at 10:59:39AM -0800, Abhinav Kumar wrote:
>> Hi Greg
>>
>> Thanks for the response.
>>
>> On 2/11/2022 3:09 AM, Greg KH wrote:
>>> On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
>>>> There are cases where depending on the size of the devcoredump and the speed
>>>> at which the usermode reads the dump, it can take longer than the current 5 mins
>>>> timeout.
>>>>
>>>> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>>>>
>>>> One example is below where it took 6 mins for the devcoredump to be completely read.
>>>>
>>>> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
>>>> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>>>
>>> What makes this so slow? Reading from the kernel shouldn't be the
>>> limit, is it where the data is being sent to?
>>
>> We are still checking this. We are seeing better read times when we bump up
>> the priority of the thread which was reading this.
>
> Where is the thread sending the data to?

The thread is writing the data to a file in local storage. From our
profiling, the read is the one taking the time, not the write.

>
>> We are also trying to check if bumping up CPU speed is helping.
>> But results have not been consistently good enough. So we thought we should
>> also increase the timeout to be safe.
>
> Why would 10 minutes be better than 30? What should the limit be? :)

Again, this is from our profiling. We are seeing a worst case time of
7 mins to finish the read for our data. That's where the 10 mins came from.
Just doubling what we have currently. I am not sure where the current 5
mins timeout came from.

>
> thanks,
>
> greg k-h
On Fri, 2022-02-11 at 23:52 -0800, Abhinav Kumar wrote:
>
> The thread is writing the data to a file in local storage. From our
> profiling, the read is the one taking the time, not the write.
>

That seems kind of hard to believe, let's say it's a 4/3 split (4
minutes reading, 3 minutes writing, to make read > write as you say),
and 3MiB size, that'd mean you get 12.8KiB/sec? That seems implausibly
low, unless you're reading with really tiny buffers?

Can you strace this somehow? (with timestamp info)

> Just doubling what we have currently. I am not sure where the current 5
> mins timeout came from.
>

To be honest it came out of thin air, and wasn't really meant as a limit
on how fast you can read (feels like even if it's tens of MiB you should
read it in milliseconds into userspace), but more of a maximum time that
we're willing to waste kernel memory if nobody is around to read the
data.

I thought it'd be better if we could somehow pin it while the userspace
is reading it, but OTOH maybe that's actually bad, since that means
userspace (though suitably privileged) could pin this kernel memory
indefinitely.

johannes
On Fri, Feb 11, 2022 at 11:52:41PM -0800, Abhinav Kumar wrote:
> Hi Greg
>
> On 2/11/2022 11:04 PM, Greg KH wrote:
> > On Fri, Feb 11, 2022 at 10:59:39AM -0800, Abhinav Kumar wrote:
> > > Hi Greg
> > >
> > > Thanks for the response.
> > >
> > > On 2/11/2022 3:09 AM, Greg KH wrote:
> > > > On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
> > > > > There are cases where depending on the size of the devcoredump and the speed
> > > > > at which the usermode reads the dump, it can take longer than the current 5 mins
> > > > > timeout.
> > > > >
> > > > > This can lead to incomplete dumps as the device is deleted once the timeout expires.
> > > > >
> > > > > One example is below where it took 6 mins for the devcoredump to be completely read.
> > > > >
> > > > > 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
> > > > > 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
> > > >
> > > > What makes this so slow? Reading from the kernel shouldn't be the
> > > > limit, is it where the data is being sent to?
> > >
> > > We are still checking this. We are seeing better read times when we bump up
> > > the priority of the thread which was reading this.
> >
> > Where is the thread sending the data to?
>
> The thread is writing the data to a file in local storage. From our
> profiling, the read is the one taking the time, not the write.

The read is coming directly from memory, there should not be any
slowdown at all here. How can that be the delay? Have a trace
somewhere?

thanks,

greg k-h
Hi Greg

On 2/12/2022 12:29 AM, Greg KH wrote:
> On Fri, Feb 11, 2022 at 11:52:41PM -0800, Abhinav Kumar wrote:
>> Hi Greg
>>
>> On 2/11/2022 11:04 PM, Greg KH wrote:
>>> On Fri, Feb 11, 2022 at 10:59:39AM -0800, Abhinav Kumar wrote:
>>>> Hi Greg
>>>>
>>>> Thanks for the response.
>>>>
>>>> On 2/11/2022 3:09 AM, Greg KH wrote:
>>>>> On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
>>>>>> There are cases where depending on the size of the devcoredump and the speed
>>>>>> at which the usermode reads the dump, it can take longer than the current 5 mins
>>>>>> timeout.
>>>>>>
>>>>>> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>>>>>>
>>>>>> One example is below where it took 6 mins for the devcoredump to be completely read.
>>>>>>
>>>>>> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
>>>>>> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>>>>>
>>>>> What makes this so slow? Reading from the kernel shouldn't be the
>>>>> limit, is it where the data is being sent to?
>>>>
>>>> We are still checking this. We are seeing better read times when we bump up
>>>> the priority of the thread which was reading this.
>>>
>>> Where is the thread sending the data to?
>>
>> The thread is writing the data to a file in local storage. From our
>> profiling, the read is the one taking the time, not the write.
>
> The read is coming directly from memory, there should not be any
> slowdown at all here. How can that be the delay? Have a trace
> somewhere?
>
> thanks,
>
> greg k-h

Yes, like I mentioned in my previous comment, we are still checking why
it's taking so long. We will update with our findings if we have any.

Alright, we will try to capture a trace to share, and will update this
thread if we find something as well.
Hi Johannes

On 2/12/2022 12:24 AM, Johannes Berg wrote:
> On Fri, 2022-02-11 at 23:52 -0800, Abhinav Kumar wrote:
>>
>> The thread is writing the data to a file in local storage. From our
>> profiling, the read is the one taking the time, not the write.
>>
>
> That seems kind of hard to believe, let's say it's a 4/3 split (4
> minutes reading, 3 minutes writing, to make read > write as you say),
> and 3MiB size, that'd mean you get 12.8KiB/sec? That seems implausibly
> low, unless you're reading with really tiny buffers?
>
> Can you strace this somehow? (with timestamp info)
>

Yes, like I have already mentioned in earlier comments, we continue to
check what's taking that long.

Once we find something from our analysis and also have the trace, we
will update the thread.

>> Just doubling what we have currently. I am not sure where the current 5
>> mins timeout came from.
>>
>
> To be honest it came out of thin air, and wasn't really meant as a limit
> on how fast you can read (feels like even if it's tens of MiB you should
> read it in milliseconds into userspace), but more of a maximum time that
> we're willing to waste kernel memory if nobody is around to read the
> data.
>
> I thought it'd be better if we could somehow pin it while the userspace
> is reading it, but OTOH maybe that's actually bad, since that means
> userspace (though suitably privileged) could pin this kernel memory
> indefinitely.
>
> johannes
Hi Johannes and Greg

On 2/12/2022 12:35 AM, Abhinav Kumar wrote:
> Hi Johannes
>
> On 2/12/2022 12:24 AM, Johannes Berg wrote:
>> On Fri, 2022-02-11 at 23:52 -0800, Abhinav Kumar wrote:
>>>
>>> The thread is writing the data to a file in local storage. From our
>>> profiling, the read is the one taking the time, not the write.
>>>
>>
>> That seems kind of hard to believe, let's say it's a 4/3 split (4
>> minutes reading, 3 minutes writing, to make read > write as you say),
>> and 3MiB size, that'd mean you get 12.8KiB/sec? That seems implausibly
>> low, unless you're reading with really tiny buffers?
>>
>> Can you strace this somehow? (with timestamp info)
>>
>
> Yes, like I have already mentioned in earlier comments, we continue to
> check what's taking that long.
>
> Once we find something from our analysis and also have the trace, we
> will update the thread.
>
>>> Just doubling what we have currently. I am not sure where the current 5
>>> mins timeout came from.
>>>
>>
>> To be honest it came out of thin air, and wasn't really meant as a limit
>> on how fast you can read (feels like even if it's tens of MiB you should
>> read it in milliseconds into userspace), but more of a maximum time that
>> we're willing to waste kernel memory if nobody is around to read the
>> data.
>>
>> I thought it'd be better if we could somehow pin it while the userspace
>> is reading it, but OTOH maybe that's actually bad, since that means
>> userspace (though suitably privileged) could pin this kernel memory
>> indefinitely.
>>
>> johannes

So, we were able to narrow down the bottleneck further. The tiny buffers
Johannes was referring to come from the kernfs read path below. It
defaults to one PAGE_SIZE worth of data per read, which results in a
very large number of reads. If we increase the length to match the size
of our data, like below, we are able to finish the read almost
instantly:

--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -184,10 +184,11 @@ static const struct seq_operations kernfs_seq_ops = {
 static ssize_t kernfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct kernfs_open_file *of = kernfs_of(iocb->ki_filp);
-	ssize_t len = min_t(size_t, iov_iter_count(iter), PAGE_SIZE);
+	ssize_t len = min_t(size_t, iov_iter_count(iter), (PAGE_SIZE * 768));
 	const struct kernfs_ops *ops;
 	char *buf;

+	pr_err("[hbc debug] %s, len:%zd\n", __func__, len);
 	buf = of->prealloc_buf;
 	if (buf)
 		mutex_lock(&of->prealloc_mutex);

(768 because the size of our dump was ~3MB, so that would be ~768 4KB
blocks)

We also did some profiling around how much increasing the block size
helps and here is the data:

Block size   cost

4KB          229s
8KB          86s
3MB          2s

So it looks like 2x or 4x the block size already helps quite a bit.
Hence, while we are exploring some options like reducing the size of the
dump, I wanted to also check if increasing the block size to something
like 4 * 4KB could be a solution, because it brings down the read times
drastically in our profiling.

Thanks

Abhinav
From: Abhinav Kumar
> Sent: 28 February 2022 21:38
...
> We also did some profiling around how much increasing the block size
> helps and here is the data:
>
> Block size   cost
>
> 4KB          229s
> 8KB          86s

You must have an O(n^2) operation in there - find it.

	David
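A rough model of that scaling, assuming (as the next reply explains) that
the whole output is re-rendered for every block served: for a dump of
size n read in blocks of size B, that is n/B reads at O(n) rendering work
each, so roughly n^2/B total, i.e. cost proportional to 1/B. That fits
the table above reasonably well: doubling 4KB to 8KB should roughly halve
the 229s (86s measured, a bit better than predicted), and B = n collapses
everything to a single render, which is the 2s case for the ~3MB dump.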
On Mon, Feb 28, 2022 at 10:49 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Abhinav Kumar
> > Sent: 28 February 2022 21:38
> ...
> > We also did some profiling around how much increasing the block size
> > helps and here is the data:
> >
> > Block size   cost
> >
> > 4KB          229s
> > 8KB          86s
>
> You must have an O(n^2) operation in there - find it.

The problem is how the devcoredump/sysfs interface works, which
results in "re-rendering" the output for each block.. it's fine for
moderate size sysfs files, but scales quite badly once you get into
couple MB size sysfs files.

It could be fixed by having some way to keep state across successive
read callbacks.

BR,
-R
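As an illustration of the pattern being described (not the actual msm or
devcoredump code; render_dump() is a made-up stand-in for whatever
formats the crash state), a dev_coredumpm()-style read callback that
keeps no state between calls ends up doing this:

/* Sketch: a stateless read callback is O(n^2) overall, because the
 * whole dump is regenerated for every block served.
 */
static ssize_t example_read(char *buffer, loff_t offset, size_t count,
			    void *data, size_t datalen)
{
	size_t full_len;
	char *full;

	/* re-render the ENTIRE dump just to serve this one block */
	full = render_dump(data, &full_len);	/* hypothetical helper */
	if (!full)
		return -ENOMEM;

	if (offset >= full_len) {
		kfree(full);
		return 0;
	}

	count = min_t(size_t, count, full_len - offset);
	memcpy(buffer, full + offset, count);
	kfree(full);

	/* the next block triggers another full render: n/B calls, O(n) each */
	return count;
}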
On Tue, 2022-03-01 at 09:45 -0800, Rob Clark wrote:
> On Mon, Feb 28, 2022 at 10:49 PM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Abhinav Kumar
> > > Sent: 28 February 2022 21:38
> > ...
> > > We also did some profiling around how much increasing the block size
> > > helps and here is the data:
> > >
> > > Block size   cost
> > >
> > > 4KB          229s
> > > 8KB          86s
> >
> > You must have an O(n^2) operation in there - find it.
>
> The problem is how the devcoredump/sysfs interface works, which
> results in "re-rendering" the output for each block.. it's fine for
> moderate size sysfs files, but scales quite badly once you get into
> couple MB size sysfs files.
>
> It could be fixed by having some way to keep state across successive
> read callbacks.

I'm not sure that's true? Perhaps for dev_coredumpm(), but only if you
implemented read() badly. If you have e.g. dev_coredumpsg() or
dev_coredumpv() that's just a simple read from the existing buffer.

johannes
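For contrast, a sketch of the cheap case being referred to: when the dump
was handed over as one prebuilt buffer (as with dev_coredumpv()), each
read can be served straight out of that buffer, so the per-block cost
stays constant. This mirrors the buffer-backed path conceptually; the
exact in-tree code may differ:

/* Sketch: reads against a prebuilt buffer are one bounded copy each. */
static ssize_t buffer_read(char *buffer, loff_t offset, size_t count,
			   void *data, size_t datalen)
{
	/* memory_read_from_buffer() clamps count against datalen and
	 * copies from data + offset: no re-rendering per call.
	 */
	return memory_read_from_buffer(buffer, count, &offset, data, datalen);
}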
diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index f4d794d..6b83ae5 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -18,8 +18,8 @@ static struct class devcd_class;
 /* global disable flag, for security purposes */
 static bool devcd_disabled;

-/* if data isn't read by userspace after 5 minutes then delete it */
-#define DEVCD_TIMEOUT	(HZ * 60 * 5)
+/* if data isn't read by userspace after 10 minutes then delete it */
+#define DEVCD_TIMEOUT	(HZ * 60 * 10)

 struct devcd_entry {
 	struct device devcd_dev;
There are cases where depending on the size of the devcoredump and the
speed at which the usermode reads the dump, it can take longer than the
current 5 mins timeout.

This can lead to incomplete dumps as the device is deleted once the
timeout expires.

One example is below where it took 6 mins for the devcoredump to be
completely read.

04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node

Increase the timeout to 10 mins to accommodate system delays and large
coredump sizes.

Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
---
 drivers/base/devcoredump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)