Message ID | 20240729090639.852732-1-max.kellermann@ionos.com (mailing list archive) |
---|---|
State | New, archived |
Series | fs/ceph/addr: pass using_pgpriv2=false to fscache_write_to_cache() |
For the moment, ceph has to continue using PG_private_2. It doesn't use netfs_writepages(). I have mostly complete patches to fix that, but they got popped onto the back burner for a bit.

I've finally managed to get cephfs set up and can now reproduce the hang you're seeing.

David
I think the right thing to do is probably to at least partially revert:

	ae678317b95e760607c7b20b97c9cd4ca9ed6e1a
	netfs: Remove deprecated use of PG_private_2 as a second writeback flag

for the moment. That removed the bit that actually did the write to the cache on behalf of ceph.

David
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 8c16bc5250ef..aacea3e8fd6d 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -512,7 +512,7 @@ static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, b
 	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
 
 	fscache_write_to_cache(cookie, inode->i_mapping, off, len, i_size_read(inode),
-			       ceph_fscache_write_terminated, inode, true, caching);
+			       ceph_fscache_write_terminated, inode, false, caching);
 }
 #else
 static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, bool caching)
This piece was missing in commit ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag").

There is one remaining use of PG_private_2: the function __fscache_clear_page_bits(), whose only purpose is to clear PG_private_2. This is done via folio_end_private_2(), which also releases the folio reference that was supposed to be taken by folio_start_private_2() (via ceph_set_page_fscache()).

__fscache_clear_page_bits() is called by __fscache_write_to_cache(), but only if the parameter using_pgpriv2 is true; the only caller of that function is ceph_fscache_write_to_cache(), which still passes true.

By calling folio_end_private_2() without folio_start_private_2(), the folio refcounter breaks and causes trouble like RCU stalls and general protection faults.

Cc: stable@vger.kernel.org
Fixes: ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag")
Link: https://lore.kernel.org/ceph-devel/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=zCy9Y87R1qw@mail.gmail.com/
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
 fs/ceph/addr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
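
The refcount imbalance described in the commit message can be sketched with a small user-space toy model. This is not kernel code: struct toy_folio and every toy_* function are hypothetical stand-ins invented for illustration, the counter is a plain int rather than the kernel's atomic reference count, and wakeups and locking are ignored. It only assumes what the message states, namely that the start helper takes a reference and the end helper drops one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: one refcount and one flag. */
struct toy_folio {
	int refcount;	/* signed, so a drop below the owner's reference is visible */
	bool private_2;	/* models PG_private_2 */
};

/* Models folio_start_private_2(): take a reference, then set the flag. */
static void toy_start_private_2(struct toy_folio *f)
{
	f->refcount++;
	f->private_2 = true;
}

/* Models folio_end_private_2(): clear the flag, then drop the reference. */
static void toy_end_private_2(struct toy_folio *f)
{
	f->private_2 = false;
	f->refcount--;
}

/* Models the write path: the end helper runs only if using_pgpriv2 is true. */
static void toy_write_to_cache(struct toy_folio *f, bool using_pgpriv2)
{
	if (using_pgpriv2)
		toy_end_private_2(f);
}

/* Balanced case: start is called first, so the end helper's put is paired. */
static int toy_refcount_balanced(void)
{
	struct toy_folio f = { .refcount = 1, .private_2 = false };

	toy_start_private_2(&f);
	assert(f.private_2 && f.refcount == 2);
	toy_write_to_cache(&f, true);
	return f.refcount;	/* back to the owner's single reference */
}

/* Case from the commit message: start is never called before the write. */
static int toy_refcount_without_start(bool using_pgpriv2)
{
	struct toy_folio f = { .refcount = 1, .private_2 = false };

	toy_write_to_cache(&f, using_pgpriv2);
	return f.refcount;
}
```

In this model, toy_refcount_without_start(true) ends with the owner's reference already dropped, which corresponds to the premature release described above, while passing false leaves the count untouched, as the one-line patch intends.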