[RFC] swapfile: disable swapon for bs > ps devices

Message ID 20240627000924.2074949-1-mcgrof@kernel.org (mailing list archive)
State New, archived

Commit Message

Luis Chamberlain June 27, 2024, 12:09 a.m. UTC
Devices which require a block size larger than the page size (bs > ps)
cannot be supported for swap yet, as swap still needs work. Once the
block device cache sets the min order for block devices [0] we need
this stopgap, otherwise all swap operations are rejected.

[0] https://lore.kernel.org/all/20240510102906.51844-6-hare@kernel.org/T/#md09501306c649dd84db0a711f9359570c17a197f

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---

This is *way* forward looking: it only matters after the LBS patches and once
we square away how to support things on the block device cache. Only then does
it make sense to start to consider this. And even then this is just a stopgap.

But if you think about it, in practice we are going forward with a world
where we have AWUPF >= NPWG to enable the physical_block_size to be
>= NPWG. So the corner case we want to help users *try* to avoid is not
enabling swap when the LBA format is > PAGE_SIZE (although for sport we
can support that) but enabling swap when NPWG > PAGE_SIZE. We'd warn about
that until swap gets a facelift. That is, 4k writes will work for devices
with, for example, a 4k LBA format but NPWG = 16k. They would work with a
RMW penalty, just as RMWs happen today with drives formatted with a 512
LBA format in today's default world of 4k IU.
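
To make the RMW cases above concrete, here is a minimal user-space sketch,
not kernel code. The helper name write_needs_rmw() is hypothetical, and it
assumes the IU (or NPWG) is a power of two and that a write avoids RMW only
when it covers whole units:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only.
 * Returns true if a write of `len` bytes at byte `offset` starts or ends
 * in the middle of a unit of `iu` bytes, which forces the device to
 * read-modify-write that unit.
 * Assumption: `iu` is a non-zero power of two.
 */
static bool write_needs_rmw(uint64_t offset, uint64_t len, uint64_t iu)
{
	uint64_t mask = iu - 1;

	assert(iu != 0 && (iu & mask) == 0);

	/* Both the start and the end of the write must be unit aligned. */
	return (offset & mask) != 0 || ((offset + len) & mask) != 0;
}
```

With these assumptions, write_needs_rmw(0, 4096, 16384) is true (a 4k swap
write on a 16k NPWG device) and write_needs_rmw(0, 512, 4096) is true as
well (a 512 byte write on a 4k IU drive), while a full 16k aligned write
is not.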

As it turns out we have no topology information for the IU today. The
physical_block_size used to have language about RMW. During the 2024
LSFMM thread about Large Block for IO that Hannes proposed, we reviewed
this discrepancy [1], but we seemed to conclude then that no changes
were required.

I'm starting to think that exposing the IU might make sense now. The
patch below would not capture the case of IU > PAGE_SIZE. In theory that
should work, it's just RMWs, but users likely should be informed that
it is stupid for them to do that. The other, more important use case
would be STATX_DIOALIGN, for the dio_offset_align. That seems
incorrect today even for existing drives with 4k IU and 512 LBA format.

Thoughts?

[1] https://lore.kernel.org/all/ZekfZdchUnRZoebo@bombadil.infradead.org/

 mm/swapfile.c | 5 +++++
 1 file changed, 5 insertions(+)

Patch

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2f5203aa2d2c..9ff168760bc2 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3153,6 +3153,11 @@  SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto bad_swap_unlock_inode;
 	}
 
+	if (mapping_min_folio_order(mapping) > 0) {
+		error = -EINVAL;
+		goto bad_swap_unlock_inode;
+	}
+
 	/*
 	 * Read the swap header.
 	 */
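
For illustration only, the effect of the new check can be modelled in user
space. This sketch assumes the block device cache sets the minimum folio
order so that one folio covers one logical block, which is my reading of
[0]. The helpers min_folio_order() and swapon_check() are hypothetical
stand-ins, not the kernel functions:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model: smallest order such that
 * (page_size << order) >= block_size.
 * Assumption: both sizes are non-zero powers of two.
 */
static unsigned int min_folio_order(uint64_t block_size, uint64_t page_size)
{
	unsigned int order = 0;

	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	while ((page_size << order) < block_size)
		order++;
	return order;
}

/* Mirrors the shape of the patch: reject swapon when the min order is > 0. */
static int swapon_check(uint64_t block_size, uint64_t page_size)
{
	if (min_folio_order(block_size, page_size) > 0)
		return -EINVAL;
	return 0;
}
```

Under that assumption a 16k block size on a 4k page size system gives a
min order of 2 and swapon fails with -EINVAL, while 512 byte and 4k block
sizes give order 0 and are unaffected.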