Message ID | 20220419045045.1664996-1-ruansy.fnst@fujitsu.com (mailing list archive) |
---|---|
Series | fsdax: introduce fs query to support reflink |
Hi Ruan,

On Tue, Apr 19, 2022 at 12:50:38PM +0800, Shiyang Ruan wrote:
> This patchset is aimed to support shared pages tracking for fsdax.

Now that this is largely reviewed, it's time to work out the
logistics of merging it.

> Changes since V12:
> - Rebased onto next-20220414

What does this depend on that is in the linux-next kernel?

i.e. can this be applied successfully to a v5.18-rc2 kernel without
needing to drag in any other patchsets/commits/trees?

What are your plans for the followup patches that enable
reflink+fsdax in XFS? AFAICT that patchset hasn't been posted for a
while so I don't know what its status is. Is that patchset anywhere
near ready for merge in this cycle?

If that patchset is not a candidate for this cycle, then it largely
doesn't matter what tree this is merged through as there shouldn't
be any major XFS or dax dependencies being built on top of it during
this cycle. The filesystem side changes are isolated and won't
conflict with other work in XFS, either, so this could easily go
through Dan's tree.

However, if the reflink enablement is ready to go, then this all
needs to be in the XFS tree so that we can run it through filesystem
level DAX+reflink testing. That will mean we need this in a stable
shared topic branch and tighter co-ordination between the trees.

So before we go any further we need to know if the dax+reflink
enablement patchset is near being ready to merge....

Cheers,

Dave.

Hi Dave,

On 2022/4/21 9:20, Dave Chinner wrote:
> Hi Ruan,
>
> On Tue, Apr 19, 2022 at 12:50:38PM +0800, Shiyang Ruan wrote:
>> This patchset is aimed to support shared pages tracking for fsdax.
>
> Now that this is largely reviewed, it's time to work out the
> logistics of merging it.

Thanks!

>
>> Changes since V12:
>> - Rebased onto next-20220414
>
> What does this depend on that is in the linux-next kernel?
>
> i.e. can this be applied successfully to a v5.18-rc2 kernel without
> needing to drag in any other patchsets/commits/trees?

Firstly, I tried to apply it to v5.18-rc2 but it failed.

There are some changes in memory-failure.c which conflict with my Patch-02:
"mm/hwpoison: fix race between hugetlb free/demotion and
memory_failure_hugetlb()"

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=423228ce93c6a283132be38d442120c8e4cdb061

Then, the reason it is based on linux-next: I was told[1] there is a better fix
for "pgoff_address()" in linux-next:
"mm: rmap: introduce pfn_mkclean_range() to cleans PTEs"

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=65c9605009f8317bb3983519874d755a0b2ca746
so I rebased my patches onto it and dropped one of mine.

[1] https://lore.kernel.org/linux-xfs/YkPuooGD139Wpg1v@infradead.org/

>
> What are your plans for the followup patches that enable
> reflink+fsdax in XFS? AFAICT that patchset hasn't been posted for a
> while so I don't know what its status is. Is that patchset anywhere
> near ready for merge in this cycle?
>
> If that patchset is not a candidate for this cycle, then it largely
> doesn't matter what tree this is merged through as there shouldn't
> be any major XFS or dax dependencies being built on top of it during
> this cycle. The filesystem side changes are isolated and won't
> conflict with other work in XFS, either, so this could easily go
> through Dan's tree.
>
> However, if the reflink enablement is ready to go, then this all
> needs to be in the XFS tree so that we can run it through filesystem
> level DAX+reflink testing. That will mean we need this in a stable
> shared topic branch and tighter co-ordination between the trees.
>
> So before we go any further we need to know if the dax+reflink
> enablement patchset is near being ready to merge....

The "reflink+fsdax" patchset is here:
https://lore.kernel.org/linux-xfs/20210928062311.4012070-1-ruansy.fnst@fujitsu.com/

It was based on v5.15-rc3, so I think I should do a rebase.

--
Thanks,
Ruan.

>
> Cheers,
>
> Dave.

[ add Andrew and Naoya ]

On Wed, Apr 20, 2022 at 6:48 PM Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
>
> Hi Dave,
>
> On 2022/4/21 9:20, Dave Chinner wrote:
> > Hi Ruan,
> >
> > On Tue, Apr 19, 2022 at 12:50:38PM +0800, Shiyang Ruan wrote:
> >> This patchset is aimed to support shared pages tracking for fsdax.
> >
> > Now that this is largely reviewed, it's time to work out the
> > logistics of merging it.
>
> Thanks!
>
> >
> >> Changes since V12:
> >> - Rebased onto next-20220414
> >
> > What does this depend on that is in the linux-next kernel?
> >
> > i.e. can this be applied successfully to a v5.18-rc2 kernel without
> > needing to drag in any other patchsets/commits/trees?
>
> Firstly, I tried to apply it to v5.18-rc2 but it failed.
>
> There are some changes in memory-failure.c which conflict with my Patch-02:
> "mm/hwpoison: fix race between hugetlb free/demotion and
> memory_failure_hugetlb()"
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=423228ce93c6a283132be38d442120c8e4cdb061
>
> Then, the reason it is based on linux-next: I was told[1] there is a better fix
> for "pgoff_address()" in linux-next:
> "mm: rmap: introduce pfn_mkclean_range() to cleans PTEs"
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=65c9605009f8317bb3983519874d755a0b2ca746
> so I rebased my patches onto it and dropped one of mine.
>
> [1] https://lore.kernel.org/linux-xfs/YkPuooGD139Wpg1v@infradead.org/

From my perspective, once something has -mm dependencies it needs to
go through Andrew's tree, and if it's going through Andrew's tree I
think that means the reflink side of this needs to wait a cycle as
there is no stable point that the XFS tree could merge to build on top
of.

The last reviewed-by this wants before going through there is Naoya's
on the memory-failure.c changes.

On Wed, Apr 20, 2022 at 07:20:07PM -0700, Dan Williams wrote:
> [ add Andrew and Naoya ]
>
> On Wed, Apr 20, 2022 at 6:48 PM Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> >
> > Hi Dave,
> >
> > On 2022/4/21 9:20, Dave Chinner wrote:
> > > Hi Ruan,
> > >
> > > On Tue, Apr 19, 2022 at 12:50:38PM +0800, Shiyang Ruan wrote:
> > >> This patchset is aimed to support shared pages tracking for fsdax.
> > >
> > > Now that this is largely reviewed, it's time to work out the
> > > logistics of merging it.
> >
> > Thanks!
> >
> > >
> > >> Changes since V12:
> > >> - Rebased onto next-20220414
> > >
> > > What does this depend on that is in the linux-next kernel?
> > >
> > > i.e. can this be applied successfully to a v5.18-rc2 kernel without
> > > needing to drag in any other patchsets/commits/trees?
> >
> > Firstly, I tried to apply it to v5.18-rc2 but it failed.
> >
> > There are some changes in memory-failure.c which conflict with my Patch-02:
> > "mm/hwpoison: fix race between hugetlb free/demotion and
> > memory_failure_hugetlb()"
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=423228ce93c6a283132be38d442120c8e4cdb061
> >
> > Then, the reason it is based on linux-next: I was told[1] there is a better fix
> > for "pgoff_address()" in linux-next:
> > "mm: rmap: introduce pfn_mkclean_range() to cleans PTEs"
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=65c9605009f8317bb3983519874d755a0b2ca746
> > so I rebased my patches onto it and dropped one of mine.
> >
> > [1] https://lore.kernel.org/linux-xfs/YkPuooGD139Wpg1v@infradead.org/
>
> From my perspective, once something has -mm dependencies it needs to
> go through Andrew's tree, and if it's going through Andrew's tree I
> think that means the reflink side of this needs to wait a cycle as
> there is no stable point that the XFS tree could merge to build on top
> of.

Ngggh. Still? Really?

Sure, I'm not a maintainer and just the stand-in patch shepherd for
a single release. However, being unable to cleanly merge code we
need integrated into our local subsystem tree for integration
testing because a patch dependency with another subsystem won't gain
a stable commit ID until the next merge window is .... distinctly
suboptimal.

We know how to do this cleanly, quickly and efficiently - we've been
doing cross-subsystem shared git branch co-ordination for
VFS/fs/block stuff when needed for many, many years. It's pretty
easy to do, just requires clear communication to decide where the
source branch will be kept. It doesn't even matter what order Linus
then merges the trees - they are self contained and git sorts out
the duplicated commits without an issue.

I mean, we've been using git for *17 years* now - this stuff should
be second nature to maintainers by now. So how is it still
considered acceptable for a core kernel subsystem not to have the
ability to provide other subsystems with stable commits/branches
so we can cleanly develop cross-subsystem functionality quickly and
efficiently?

> The last reviewed-by this wants before going through there is Naoya's
> on the memory-failure.c changes.

Naoya?

Cheers,

Dave.

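[ Editorial illustration, not part of the original message. ] The point about stable commit IDs can be sketched with a toy model. The code below is only a sketch under stated assumptions: it is not git, and every name in it (commit_id, apply_series, merge, the patch titles) is hypothetical. A commit ID is modelled as a hash of the parent ID plus the change, which is enough to show why two trees that pull the same shared branch merge cleanly, while two trees that each apply their own copy of a patch on different parents end up carrying the change twice.

```python
import hashlib


def commit_id(parent_id: str, patch: str) -> str:
    """Toy commit ID: a hash of the parent ID and the change (assumption, not git's format)."""
    return hashlib.sha1(f"{parent_id}\n{patch}".encode()).hexdigest()[:12]


def apply_series(base_id: str, patches: list) -> list:
    """Apply patches in order on top of base_id; return [(commit_id, patch), ...]."""
    history = []
    parent = base_id
    for patch in patches:
        cid = commit_id(parent, patch)
        history.append((cid, patch))
        parent = cid
    return history


def merge(*histories) -> dict:
    """Union of commits keyed by ID: identical IDs collapse into one entry."""
    merged = {}
    for history in histories:
        for cid, patch in history:
            merged.setdefault(cid, patch)
    return merged


def duplicated_patches(merged: dict) -> list:
    """Changes that appear under more than one commit ID after the merge."""
    patches = list(merged.values())
    return sorted({p for p in patches if patches.count(p) > 1})


BASE = commit_id("", "release tag")      # hypothetical common starting point
DEP = "mm: dependency patch"             # hypothetical cross-subsystem dependency

# Case 1: a stable shared topic branch that both trees pull as-is.
topic = apply_series(BASE, [DEP])
tip = topic[-1][0]
mm_tree = topic + apply_series(tip, ["mm: other work"])
xfs_tree = topic + apply_series(tip, ["xfs: reflink+fsdax"])
shared = merge(mm_tree, xfs_tree)

# Case 2: no stable branch; each tree applies its own copy on a different parent.
mm_copy = apply_series(BASE, ["mm: earlier work", DEP])
xfs_copy = apply_series(BASE, [DEP, "xfs: reflink+fsdax"])
copied = merge(mm_copy, xfs_copy)

print("shared branch duplicates:", duplicated_patches(shared))
print("separate copies duplicates:", duplicated_patches(copied))
```

In the first case the dependency has one ID in both trees, so the merge order does not matter. In the second case the same change exists under two IDs, which is where merge conflicts come from.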
Hi everyone,

On Thu, Apr 21, 2022 at 02:35:02PM +1000, Dave Chinner wrote:
> On Wed, Apr 20, 2022 at 07:20:07PM -0700, Dan Williams wrote:
> > [ add Andrew and Naoya ]
> >
> > On Wed, Apr 20, 2022 at 6:48 PM Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> > >
> > > Hi Dave,
> > >
> > > On 2022/4/21 9:20, Dave Chinner wrote:
> > > > Hi Ruan,
> > > >
> > > > On Tue, Apr 19, 2022 at 12:50:38PM +0800, Shiyang Ruan wrote:
> > > >> This patchset is aimed to support shared pages tracking for fsdax.
> > > >
> > > > Now that this is largely reviewed, it's time to work out the
> > > > logistics of merging it.
> > >
> > > Thanks!
> > >
> > > >
> > > >> Changes since V12:
> > > >> - Rebased onto next-20220414
> > > >
> > > > What does this depend on that is in the linux-next kernel?
> > > >
> > > > i.e. can this be applied successfully to a v5.18-rc2 kernel without
> > > > needing to drag in any other patchsets/commits/trees?
> > >
> > > Firstly, I tried to apply it to v5.18-rc2 but it failed.
> > >
> > > There are some changes in memory-failure.c which conflict with my Patch-02:
> > > "mm/hwpoison: fix race between hugetlb free/demotion and
> > > memory_failure_hugetlb()"
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=423228ce93c6a283132be38d442120c8e4cdb061

This commit should not logically conflict with patch 2/7 (it is just a
mismatch in context) and the conflict can be trivially resolved, i.e.
simply defining the 2 new functions (unmap_and_kill() and
mf_generic_kill_procs()) just below try_to_split_thp_page() (or
somewhere else before memory_failure_dev_pagemap()) is a correct
resolution.

> > >
> > > Then, the reason it is based on linux-next: I was told[1] there is a better fix
> > > for "pgoff_address()" in linux-next:
> > > "mm: rmap: introduce pfn_mkclean_range() to cleans PTEs"
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=65c9605009f8317bb3983519874d755a0b2ca746
> > > so I rebased my patches onto it and dropped one of mine.
> > >
> > > [1] https://lore.kernel.org/linux-xfs/YkPuooGD139Wpg1v@infradead.org/
> >
> > From my perspective, once something has -mm dependencies it needs to
> > go through Andrew's tree, and if it's going through Andrew's tree I
> > think that means the reflink side of this needs to wait a cycle as
> > there is no stable point that the XFS tree could merge to build on top
> > of.
>
> Ngggh. Still? Really?
>
> Sure, I'm not a maintainer and just the stand-in patch shepherd for
> a single release. However, being unable to cleanly merge code we
> need integrated into our local subsystem tree for integration
> testing because a patch dependency with another subsystem won't gain
> a stable commit ID until the next merge window is .... distinctly
> suboptimal.
>
> We know how to do this cleanly, quickly and efficiently - we've been
> doing cross-subsystem shared git branch co-ordination for
> VFS/fs/block stuff when needed for many, many years. It's pretty
> easy to do, just requires clear communication to decide where the
> source branch will be kept. It doesn't even matter what order Linus
> then merges the trees - they are self contained and git sorts out
> the duplicated commits without an issue.
>
> I mean, we've been using git for *17 years* now - this stuff should
> be second nature to maintainers by now. So how is it still
> considered acceptable for a core kernel subsystem not to have the
> ability to provide other subsystems with stable commits/branches
> so we can cleanly develop cross-subsystem functionality quickly and
> efficiently?
>
> > The last reviewed-by this wants before going through there is Naoya's
> > on the memory-failure.c changes.
>
> Naoya?

I'll reply to the individual patches soon.

Thanks,
Naoya Horiguchi

On Thu, Apr 21, 2022 at 02:35:02PM +1000, Dave Chinner wrote:
> Sure, I'm not a maintainer and just the stand-in patch shepherd for
> a single release. However, being unable to cleanly merge code we
> need integrated into our local subsystem tree for integration
> testing because a patch dependency with another subsystem won't gain
> a stable commit ID until the next merge window is .... distinctly
> suboptimal.

Yes. Which is why we've taken a lot of mm patches through other trees,
sometimes specially crafted for that. So I guess in this case we'll
just need to take non-trivial dependencies into the XFS tree, and just
deal with small merge conflicts for the trivial ones.

On Wed, Apr 20, 2022 at 10:54:59PM -0700, Christoph Hellwig wrote:
> On Thu, Apr 21, 2022 at 02:35:02PM +1000, Dave Chinner wrote:
> > Sure, I'm not a maintainer and just the stand-in patch shepherd for
> > a single release. However, being unable to cleanly merge code we
> > need integrated into our local subsystem tree for integration
> > testing because a patch dependency with another subsystem won't gain
> > a stable commit ID until the next merge window is .... distinctly
> > suboptimal.
>
> Yes. Which is why we've taken a lot of mm patches through other trees,
> sometimes specially crafted for that. So I guess in this case we'll
> just need to take non-trivial dependencies into the XFS tree, and just
> deal with small merge conflicts for the trivial ones.

OK. As Naoya has pointed out, the first dependency/conflict Ruan has
listed looks trivial to resolve.

The second dependency, OTOH, is on a new function added in the patch
pointed to. That said, at first glance it looks to be independent of
the first two patches in that series so I might just be able to pull
that one patch in and have that leave us with a working
fsdax+reflink tree.

Regardless, I'll wait to see how much work the updated XFS/DAX
reflink enablement patchset still requires when Ruan posts it before
deciding what to do here. If it isn't going to be a merge
candidate, what to do with this patchset is moot because there's
little to test without reflink enabled...

Cheers,

Dave.

On Thu, Apr 21, 2022 at 12:47 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Wed, Apr 20, 2022 at 10:54:59PM -0700, Christoph Hellwig wrote:
> > On Thu, Apr 21, 2022 at 02:35:02PM +1000, Dave Chinner wrote:
> > > Sure, I'm not a maintainer and just the stand-in patch shepherd for
> > > a single release. However, being unable to cleanly merge code we
> > > need integrated into our local subsystem tree for integration
> > > testing because a patch dependency with another subsystem won't gain
> > > a stable commit ID until the next merge window is .... distinctly
> > > suboptimal.
> >
> > Yes. Which is why we've taken a lot of mm patches through other trees,
> > sometimes specially crafted for that. So I guess in this case we'll
> > just need to take non-trivial dependencies into the XFS tree, and just
> > deal with small merge conflicts for the trivial ones.
>
> OK. As Naoya has pointed out, the first dependency/conflict Ruan has
> listed looks trivial to resolve.
>
> The second dependency, OTOH, is on a new function added in the patch
> pointed to. That said, at first glance it looks to be independent of
> the first two patches in that series so I might just be able to pull
> that one patch in and have that leave us with a working
> fsdax+reflink tree.
>
> Regardless, I'll wait to see how much work the updated XFS/DAX
> reflink enablement patchset still requires when Ruan posts it before
> deciding what to do here. If it isn't going to be a merge
> candidate, what to do with this patchset is moot because there's
> little to test without reflink enabled...

I do have a use case for this work absent the reflink work. Recall we
had a conversation about how to communicate "dax-device has been
ripped away from the fs" events and we ended up on the idea of reusing
->notify_failure(), but with the device's entire logical address range
as the notification span. That will let me unwind and delete the
PTE_DEVMAP infrastructure for taking extra device references to hold
off device-removal. Instead ->notify_failure() arranges for all active
DAX mappings to be invalidated and allow the removal to proceed
especially since physical removal does not care about software pins.

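[ Editorial illustration, not part of the original message. ] The idea of reusing one notification for both a media error and a device removal can be sketched as a toy model. This is only a sketch under stated assumptions: it is not the kernel interface, and the Mapping class, the notify_failure() function below, DEVICE_SIZE and the offsets are all hypothetical. It shows that the same range-based call invalidates one mapping for a single bad page and every remaining mapping when the span is the device's entire logical address range.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """Hypothetical stand-in for one active DAX mapping on the device."""
    owner: str
    offset: int
    length: int
    valid: bool = True


def notify_failure(mappings: list, offset: int, length: int) -> int:
    """Invalidate every still-valid mapping overlapping [offset, offset + length).

    Returns the number of mappings invalidated by this call.
    """
    end = offset + length
    invalidated = 0
    for m in mappings:
        overlaps = m.offset < end and offset < m.offset + m.length
        if m.valid and overlaps:
            m.valid = False
            invalidated += 1
    return invalidated


DEVICE_SIZE = 1 << 30  # hypothetical 1 GiB device

mappings = [
    Mapping("a", 0, 4096),
    Mapping("b", 1 << 20, 2 << 20),
    Mapping("c", DEVICE_SIZE - 4096, 4096),
]

# Media error: the notification span is a single 4 KiB page inside mapping "b".
poisoned = notify_failure(mappings, 1 << 20, 4096)

# Device removal: the notification span is the whole logical address range.
removed = notify_failure(mappings, 0, DEVICE_SIZE)

print("invalidated by media error:", poisoned)
print("invalidated by removal:", removed)
print("any mapping still valid:", any(m.valid for m in mappings))
```

The removal case needs no per-mapping reference counting in this model: once the whole range has been reported, nothing valid is left to hold the device.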
On Fri, Apr 22, 2022 at 02:27:32PM -0700, Dan Williams wrote:
> On Thu, Apr 21, 2022 at 12:47 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Wed, Apr 20, 2022 at 10:54:59PM -0700, Christoph Hellwig wrote:
> > > On Thu, Apr 21, 2022 at 02:35:02PM +1000, Dave Chinner wrote:
> > > > Sure, I'm not a maintainer and just the stand-in patch shepherd for
> > > > a single release. However, being unable to cleanly merge code we
> > > > need integrated into our local subsystem tree for integration
> > > > testing because a patch dependency with another subsystem won't gain
> > > > a stable commit ID until the next merge window is .... distinctly
> > > > suboptimal.
> > >
> > > Yes. Which is why we've taken a lot of mm patches through other trees,
> > > sometimes specially crafted for that. So I guess in this case we'll
> > > just need to take non-trivial dependencies into the XFS tree, and just
> > > deal with small merge conflicts for the trivial ones.
> >
> > OK. As Naoya has pointed out, the first dependency/conflict Ruan has
> > listed looks trivial to resolve.
> >
> > The second dependency, OTOH, is on a new function added in the patch
> > pointed to. That said, at first glance it looks to be independent of
> > the first two patches in that series so I might just be able to pull
> > that one patch in and have that leave us with a working
> > fsdax+reflink tree.
> >
> > Regardless, I'll wait to see how much work the updated XFS/DAX
> > reflink enablement patchset still requires when Ruan posts it before
> > deciding what to do here. If it isn't going to be a merge
> > candidate, what to do with this patchset is moot because there's
> > little to test without reflink enabled...
>
> I do have a use case for this work absent the reflink work. Recall we
> had a conversation about how to communicate "dax-device has been
> ripped away from the fs" events and we ended up on the idea of reusing
> ->notify_failure(), but with the device's entire logical address range
> as the notification span. That will let me unwind and delete the
> PTE_DEVMAP infrastructure for taking extra device references to hold
> off device-removal. Instead ->notify_failure() arranges for all active
> DAX mappings to be invalidated and allow the removal to proceed
> especially since physical removal does not care about software pins.

Sure. My point is that if the reflink enablement isn't ready to go,
then from an XFS POV none of this matters in this cycle and we can
just leave the dependencies to commit via Andrew's tree. Hence by
the time we get to the reflink enablement all the prior dependencies
will have been merged and have stable commit IDs, and we can just
stage this series and the reflink enablement as we normally would in
the next cycle.

However, if we don't get the XFS reflink dax enablement sorted out
in the next week or two, then we don't need this patchset in this
cycle. Hence if you still need this patchset for other code you need
to merge in this cycle, then you're the poor schmuck that has to run
the mm-tree conflict gauntlet to get a stable commit ID for the
dependent patches in this cycle, not me....

Cheers,

Dave.

On Fri, Apr 22, 2022 at 5:02 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Fri, Apr 22, 2022 at 02:27:32PM -0700, Dan Williams wrote:
> > On Thu, Apr 21, 2022 at 12:47 AM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Wed, Apr 20, 2022 at 10:54:59PM -0700, Christoph Hellwig wrote:
> > > > On Thu, Apr 21, 2022 at 02:35:02PM +1000, Dave Chinner wrote:
> > > > > Sure, I'm not a maintainer and just the stand-in patch shepherd for
> > > > > a single release. However, being unable to cleanly merge code we
> > > > > need integrated into our local subsystem tree for integration
> > > > > testing because a patch dependency with another subsystem won't gain
> > > > > a stable commit ID until the next merge window is .... distinctly
> > > > > suboptimal.
> > > >
> > > > Yes. Which is why we've taken a lot of mm patches through other trees,
> > > > sometimes specially crafted for that. So I guess in this case we'll
> > > > just need to take non-trivial dependencies into the XFS tree, and just
> > > > deal with small merge conflicts for the trivial ones.
> > >
> > > OK. As Naoya has pointed out, the first dependency/conflict Ruan has
> > > listed looks trivial to resolve.
> > >
> > > The second dependency, OTOH, is on a new function added in the patch
> > > pointed to. That said, at first glance it looks to be independent of
> > > the first two patches in that series so I might just be able to pull
> > > that one patch in and have that leave us with a working
> > > fsdax+reflink tree.
> > >
> > > Regardless, I'll wait to see how much work the updated XFS/DAX
> > > reflink enablement patchset still requires when Ruan posts it before
> > > deciding what to do here. If it isn't going to be a merge
> > > candidate, what to do with this patchset is moot because there's
> > > little to test without reflink enabled...
> >
> > I do have a use case for this work absent the reflink work. Recall we
> > had a conversation about how to communicate "dax-device has been
> > ripped away from the fs" events and we ended up on the idea of reusing
> > ->notify_failure(), but with the device's entire logical address range
> > as the notification span. That will let me unwind and delete the
> > PTE_DEVMAP infrastructure for taking extra device references to hold
> > off device-removal. Instead ->notify_failure() arranges for all active
> > DAX mappings to be invalidated and allow the removal to proceed
> > especially since physical removal does not care about software pins.
>
> Sure. My point is that if the reflink enablement isn't ready to go,
> then from an XFS POV none of this matters in this cycle and we can
> just leave the dependencies to commit via Andrew's tree. Hence by
> the time we get to the reflink enablement all the prior dependencies
> will have been merged and have stable commit IDs, and we can just
> stage this series and the reflink enablement as we normally would in
> the next cycle.
>
> However, if we don't get the XFS reflink dax enablement sorted out
> in the next week or two, then we don't need this patchset in this
> cycle. Hence if you still need this patchset for other code you need
> to merge in this cycle, then you're the poor schmuck that has to run
> the mm-tree conflict gauntlet to get a stable commit ID for the
> dependent patches in this cycle, not me....

Yup. Let's give it another week or so to see if the reflink rebase
materializes and go from there.