Message ID | 20190731114935.11030-1-ruansy.fnst@cn.fujitsu.com (mailing list archive) |
---|---|
Headers | show |
Series | xfs: add reflink & dedupe support for fsdax. | expand |
On 19:49 31/07, Shiyang Ruan wrote: > This patchset aims to take care of this issue to make reflink and dedupe > work correctly in XFS. > > It is based on Goldwyn's patchsets: "v4 Btrfs dax support" and "Btrfs > iomap". I picked up some patches related and made a few fix to make it > basically works fine. > > For dax framework: > 1. adapt to the latest change in iomap. > > For XFS: > 1. report the source address and set IOMAP_COW type for those write > operations that need COW. > 2. update extent list at the end. > 3. add file contents comparison function based on dax framework. > 4. use xfs_break_layouts() to support dax. Shiyang, I think you used the older patches which does not contain the iomap changes which would call iomap_begin() with two iomaps. I have it in the btrfs-iomap branch and plan to update it today. It is built on v5.3-rcX, so it should contain the changes which moves the iomap code to the different directory. I will build the dax patches on top of that. However, we are making a big dependency chain here :(
On 8/1/19 4:33 AM, Goldwyn Rodrigues wrote: > On 19:49 31/07, Shiyang Ruan wrote: >> This patchset aims to take care of this issue to make reflink and dedupe >> work correctly in XFS. >> >> It is based on Goldwyn's patchsets: "v4 Btrfs dax support" and "Btrfs >> iomap". I picked up some patches related and made a few fix to make it >> basically works fine. >> >> For dax framework: >> 1. adapt to the latest change in iomap. >> >> For XFS: >> 1. report the source address and set IOMAP_COW type for those write >> operations that need COW. >> 2. update extent list at the end. >> 3. add file contents comparison function based on dax framework. >> 4. use xfs_break_layouts() to support dax. > > Shiyang, > > I think you used the older patches which does not contain the iomap changes > which would call iomap_begin() with two iomaps. I have it in the btrfs-iomap Oh, Sorry for my carelessness. This patchset is built on your "Btrfs iomap". I didn't point it out in cover letter. > branch and plan to update it today. It is built on v5.3-rcX, so it should > contain the changes which moves the iomap code to the different directory. > I will build the dax patches on top of that. > However, we are making a big dependency chain here Don't worry. It's fine for me. I'll follow your updates. >
On Thu, Aug 01, 2019 at 09:37:04AM +0800, Shiyang Ruan wrote: > > > On 8/1/19 4:33 AM, Goldwyn Rodrigues wrote: > > On 19:49 31/07, Shiyang Ruan wrote: > > > This patchset aims to take care of this issue to make reflink and dedupe > > > work correctly in XFS. > > > > > > It is based on Goldwyn's patchsets: "v4 Btrfs dax support" and "Btrfs > > > iomap". I picked up some patches related and made a few fix to make it > > > basically works fine. > > > > > > For dax framework: > > > 1. adapt to the latest change in iomap. > > > > > > For XFS: > > > 1. report the source address and set IOMAP_COW type for those write > > > operations that need COW. > > > 2. update extent list at the end. > > > 3. add file contents comparison function based on dax framework. > > > 4. use xfs_break_layouts() to support dax. > > > > Shiyang, > > > > I think you used the older patches which does not contain the iomap changes > > which would call iomap_begin() with two iomaps. I have it in the btrfs-iomap > > Oh, Sorry for my carelessness. This patchset is built on your "Btrfs > iomap". I didn't point it out in cover letter. > > > branch and plan to update it today. It is built on v5.3-rcX, so it should > > contain the changes which moves the iomap code to the different directory. > > I will build the dax patches on top of that. > > However, we are making a big dependency chain here > Don't worry. It's fine for me. I'll follow your updates. Hi Shiyang, I'll wait for you to update your patches on top of the latest btrfs patches before looking at this any futher. it would be good to get a set of iomap infrastructure patches separated from the btrfs patchsets so could have them both built from a common patchset. Cheers, Dave.
Btw, I just had a chat with Dan last week on this. And he pointed out that while this series deals with the read/write path issues of reflink on DAX it doesn't deal with the mmap side issue that page->mapping and page->index can point back to exactly one file. I think we want a few xfstests that reflink a file and then use the different links using mmap, as that should blow up pretty reliably.
On Tue, Oct 08, 2019 at 11:31:44PM -0700, Christoph Hellwig wrote: > Btw, I just had a chat with Dan last week on this. And he pointed out > that while this series deals with the read/write path issues of > reflink on DAX it doesn't deal with the mmap side issue that > page->mapping and page->index can point back to exactly one file. > > I think we want a few xfstests that reflink a file and then use the > different links using mmap, as that should blow up pretty reliably. Hmm, you're right, we don't actually have a test that checks the behavior of mwriting all copies of a shared block. Ok, I'll go write one. --D
On Wed, Oct 09, 2019 at 10:11:52AM -0700, Darrick J. Wong wrote: > On Tue, Oct 08, 2019 at 11:31:44PM -0700, Christoph Hellwig wrote: > > Btw, I just had a chat with Dan last week on this. And he pointed out > > that while this series deals with the read/write path issues of > > reflink on DAX it doesn't deal with the mmap side issue that > > page->mapping and page->index can point back to exactly one file. > > > > I think we want a few xfstests that reflink a file and then use the > > different links using mmap, as that should blow up pretty reliably. > > Hmm, you're right, we don't actually have a test that checks the > behavior of mwriting all copies of a shared block. Ok, I'll go write > one. I've pointed this problem out to everyone who has asked me "what do we need to do to support reflink on DAX". I've even walked a couple of people right through the problem that needs to be solved and discussed the potential solutions to it. Problems that I think need addressing: - device dax and filesystem dax have fundamentally different needs in this space, so they need to be separated and not try to use the same solution. - dax_lock_entry() being used as a substitute for page_lock() but it not being held on the page itself means it can't be extended to serialise access to the page across multiple mappings that are unaware of each other - dax_lock_page/dax_unlock_page interface for hardware memory errors needs to report to the filesystem for processing and repair, not assume the page is user data and killing processes is the only possible recovery mechanism. - dax_associate_entry/dax_disassociate_entry can only work for a 1:1 page:mapping,index relationship. It needs to go away and be replaced by a mechanism that allows tracking multiple page mapping/index/state tuples. This has much wider use than DAX (e.g. sharing page cache pages between reflinked files) I've proposed shadow pages (based on a concept from Matethw Wilcox) for each read-only reflink mapping with the real physical page being owned by the filesystem and indexed by LBA in the filesystem buffer cache. This would be based on whether the extent in the file the page is mapped from has multiple references to it. i.e. When a new page mapping occurs in a shared extent, we add the page to the buffer cache (i.e. point a struct xfs_buf at it)i if it isn't already present, then allocate a shadow page, point it at the master, set it up with the new mapping,index tuple and add it to the mapping tree. Then we can treat it as a unique page even though it points to the read-only master page. When the page get's COWed, we toss away the shadow page and the master can be reclaimed with the reference count goes to zero or the extent is no longer shared. Think of it kind of like the way we multiply reference the zero page for holes in mmap()d dax regions, except we can have millions of them and they are found by physical buffer cache index lookups. This works for both DAX and non-DAX sharing of read-only shared filesytsem pages. i.e. it would form the basis of single-copy read-only page cache pages for reflinked files. There was quite a bit of talk at LSFMM 2018 about having a linked list of mapping structures hanging off a struct page, one for each mapping that references the page. Operations would then have to walk all mappings that reference the page. This was useful to other subsystems (HMM?) for some purpose I forget, but I'm not sure it's particularly useful by itself for non-dax reflink purposes - I suspect the filesystem would still need to track such pages itself in it's buffer cache so it can find the cached page to link new reflink copies to the same page... ISTR a couple of other solutions were thrown around, but I don't think anyone came up with a simple solution... Cheers, Dave.