Message ID | 20240515125136.3714580-1-libaokun@huaweicloud.com (mailing list archive) |
---|---|
Headers | show |
Series | cachefiles: some bugfixes for clean object/send req/poll | expand |
A gentle ping. On 2024/5/15 20:51, libaokun@huaweicloud.com wrote: > From: Baokun Li <libaokun1@huawei.com> > > Hi all! > > This is the second version of this patch series. Thank you, Jia Zhu and > Gao Xiang, for the feedback in the previous version. > > We've been testing ondemand mode for cachefiles since January, and we're > almost done. We hit a lot of issues during the testing period, and this > patch set fixes some of the issues related to reopen worker/send req/poll. > The patches have passed internal testing without regression. > > Patch 1-3: A read request waiting for reopen could be closed maliciously > before the reopen worker is executing or waiting to be scheduled. So > ondemand_object_worker() may be called after the info and object and even > the cache have been freed and trigger use-after-free. So use > cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the > reopen worker or wait for it to finish. Since it makes no sense to wait > for the daemon to complete the reopen request, to avoid this pointless > operation blocking cancel_work_sync(), Patch 1 avoids request generation > by the DROPPING state when the request has not been sent, and Patch 2 > flushes the requests of the current object before cancel_work_sync(). > > Patch 4: Cyclic allocation of msg_id to avoid msg_id reuse misleading > the daemon to cause hung. > > Patch 5: Hold xas_lock during polling to avoid dereferencing reqs causing > use-after-free. This issue was triggered frequently in our tests, and we > found that anolis 5.10 had fixed it, so to avoid failing the test, this > patch was pushed upstream as well. > > Comments and questions are, as always, welcome. > Please let me know what you think. > > Thanks, > Baokun > > Changes since v1: > * Collect RVB from Jia Zhu and Gao Xiang.(Thanks for your review!) > * Pathch 1,2:Add more commit messages. > * Pathch 3:Add Fixes tag as suggested by Jia Zhu. > * Pathch 4:No longer changing "do...while" to "retry" to focus changes > and optimise commit messages. > * Pathch 5: Drop the internal RVB tag. > > [V1]: https://lore.kernel.org/all/20240424033409.2735257-1-libaokun@huaweicloud.com > > Baokun Li (3): > cachefiles: stop sending new request when dropping object > cachefiles: flush all requests for the object that is being dropped > cachefiles: cyclic allocation of msg_id to avoid reuse > > Hou Tao (1): > cachefiles: flush ondemand_object_worker during clean object > > Jingbo Xu (1): > cachefiles: add missing lock protection when polling > > fs/cachefiles/daemon.c | 4 ++-- > fs/cachefiles/internal.h | 3 +++ > fs/cachefiles/ondemand.c | 52 +++++++++++++++++++++++++++++++++++----- > 3 files changed, 51 insertions(+), 8 deletions(-) >
On 2024/6/26 11:04, Baokun Li wrote:
> A gentle ping.
Since it's been long time, I guess you could just resend
a new patchset with collected new tags instead of just
ping for the next round review?
Thanks,
Gao Xiang
On 2024/6/26 11:28, Gao Xiang wrote: > > > On 2024/6/26 11:04, Baokun Li wrote: >> A gentle ping. > > Since it's been long time, I guess you could just resend > a new patchset with collected new tags instead of just > ping for the next round review? > > Thanks, > Gao Xiang Okay, if there's still no feedback this week, I'll resend this patch series. Since both patch sets under review are now 5 patches and have similar titles, it would be more confusing if they both had RESEND. So when I resend it I will merge the two patch sets into one patch series.
On 2024/6/27 09:49, Baokun Li wrote: > On 2024/6/26 11:28, Gao Xiang wrote: >> >> >> On 2024/6/26 11:04, Baokun Li wrote: >>> A gentle ping. >> >> Since it's been long time, I guess you could just resend >> a new patchset with collected new tags instead of just >> ping for the next round review? >> >> Thanks, >> Gao Xiang > > Okay, if there's still no feedback this week, I'll resend this patch series. > > Since both patch sets under review are now 5 patches and have similar > titles, it would be more confusing if they both had RESEND. So when I > resend it I will merge the two patch sets into one patch series. Sounds fine, I think you could rearrange the RESEND patchset with the following order cachefiles: some bugfixes for withdraw and xattr cachefiles: some bugfixes for clean object/send req/poll Jingbo currently is working on the internal stuff, I will try to review myself for this work too. Thanks, Gao Xiaang >
On 2024/6/27 10:08, Gao Xiang wrote: > > > On 2024/6/27 09:49, Baokun Li wrote: >> On 2024/6/26 11:28, Gao Xiang wrote: >>> >>> >>> On 2024/6/26 11:04, Baokun Li wrote: >>>> A gentle ping. >>> >>> Since it's been long time, I guess you could just resend >>> a new patchset with collected new tags instead of just >>> ping for the next round review? >>> >>> Thanks, >>> Gao Xiang >> >> Okay, if there's still no feedback this week, I'll resend this patch >> series. >> >> Since both patch sets under review are now 5 patches and have similar >> titles, it would be more confusing if they both had RESEND. So when I >> resend it I will merge the two patch sets into one patch series. > > Sounds fine, I think you could rearrange the RESEND patchset with > the following order > cachefiles: some bugfixes for withdraw and xattr > cachefiles: some bugfixes for clean object/send req/poll Okay, I'll arrange the patches in that order. Thank you for your suggestion! > > Jingbo currently is working on the internal stuff, I will try to > review myself for this work too. > > Thanks, > Gao Xiaang Thank you so much for helping to review these patches.
On Wed, 2024-05-15 at 20:51 +0800, libaokun@huaweicloud.com wrote: > From: Baokun Li <libaokun1@huawei.com> > > Hi all! > > This is the second version of this patch series. Thank you, Jia Zhu and > Gao Xiang, for the feedback in the previous version. > > We've been testing ondemand mode for cachefiles since January, and we're > almost done. We hit a lot of issues during the testing period, and this > patch set fixes some of the issues related to reopen worker/send req/poll. > The patches have passed internal testing without regression. > > Patch 1-3: A read request waiting for reopen could be closed maliciously > before the reopen worker is executing or waiting to be scheduled. So > ondemand_object_worker() may be called after the info and object and even > the cache have been freed and trigger use-after-free. So use > cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the > reopen worker or wait for it to finish. Since it makes no sense to wait > for the daemon to complete the reopen request, to avoid this pointless > operation blocking cancel_work_sync(), Patch 1 avoids request generation > by the DROPPING state when the request has not been sent, and Patch 2 > flushes the requests of the current object before cancel_work_sync(). > > Patch 4: Cyclic allocation of msg_id to avoid msg_id reuse misleading > the daemon to cause hung. > > Patch 5: Hold xas_lock during polling to avoid dereferencing reqs causing > use-after-free. This issue was triggered frequently in our tests, and we > found that anolis 5.10 had fixed it, so to avoid failing the test, this > patch was pushed upstream as well. > > Comments and questions are, as always, welcome. > Please let me know what you think. > > Thanks, > Baokun > > Changes since v1: > * Collect RVB from Jia Zhu and Gao Xiang.(Thanks for your review!) > * Pathch 1,2:Add more commit messages. > * Pathch 3:Add Fixes tag as suggested by Jia Zhu. > * Pathch 4:No longer changing "do...while" to "retry" to focus changes > and optimise commit messages. > * Pathch 5: Drop the internal RVB tag. > > [V1]: https://lore.kernel.org/all/20240424033409.2735257-1-libaokun@huaweicloud.com > > Baokun Li (3): > cachefiles: stop sending new request when dropping object > cachefiles: flush all requests for the object that is being dropped > cachefiles: cyclic allocation of msg_id to avoid reuse > > Hou Tao (1): > cachefiles: flush ondemand_object_worker during clean object > > Jingbo Xu (1): > cachefiles: add missing lock protection when polling > > fs/cachefiles/daemon.c | 4 ++-- > fs/cachefiles/internal.h | 3 +++ > fs/cachefiles/ondemand.c | 52 +++++++++++++++++++++++++++++++++++----- > 3 files changed, 51 insertions(+), 8 deletions(-) > The set itself looks fairly straightforward, but I don't know this code well enough to give it a proper review. Acked-by: Jeff Layton <jlayton@kernel.org>
From: Baokun Li <libaokun1@huawei.com> Hi all! This is the second version of this patch series. Thank you, Jia Zhu and Gao Xiang, for the feedback in the previous version. We've been testing ondemand mode for cachefiles since January, and we're almost done. We hit a lot of issues during the testing period, and this patch set fixes some of the issues related to reopen worker/send req/poll. The patches have passed internal testing without regression. Patch 1-3: A read request waiting for reopen could be closed maliciously before the reopen worker is executing or waiting to be scheduled. So ondemand_object_worker() may be called after the info and object and even the cache have been freed and trigger use-after-free. So use cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the reopen worker or wait for it to finish. Since it makes no sense to wait for the daemon to complete the reopen request, to avoid this pointless operation blocking cancel_work_sync(), Patch 1 avoids request generation by the DROPPING state when the request has not been sent, and Patch 2 flushes the requests of the current object before cancel_work_sync(). Patch 4: Cyclic allocation of msg_id to avoid msg_id reuse misleading the daemon to cause hung. Patch 5: Hold xas_lock during polling to avoid dereferencing reqs causing use-after-free. This issue was triggered frequently in our tests, and we found that anolis 5.10 had fixed it, so to avoid failing the test, this patch was pushed upstream as well. Comments and questions are, as always, welcome. Please let me know what you think. Thanks, Baokun Changes since v1: * Collect RVB from Jia Zhu and Gao Xiang.(Thanks for your review!) * Pathch 1,2:Add more commit messages. * Pathch 3:Add Fixes tag as suggested by Jia Zhu. * Pathch 4:No longer changing "do...while" to "retry" to focus changes and optimise commit messages. * Pathch 5: Drop the internal RVB tag. [V1]: https://lore.kernel.org/all/20240424033409.2735257-1-libaokun@huaweicloud.com Baokun Li (3): cachefiles: stop sending new request when dropping object cachefiles: flush all requests for the object that is being dropped cachefiles: cyclic allocation of msg_id to avoid reuse Hou Tao (1): cachefiles: flush ondemand_object_worker during clean object Jingbo Xu (1): cachefiles: add missing lock protection when polling fs/cachefiles/daemon.c | 4 ++-- fs/cachefiles/internal.h | 3 +++ fs/cachefiles/ondemand.c | 52 +++++++++++++++++++++++++++++++++++----- 3 files changed, 51 insertions(+), 8 deletions(-)