From patchwork Tue Jan  7 09:43:47 2025
X-Patchwork-Submitter: Daniel Gomez
X-Patchwork-Id: 13928573
Date: Tue, 7 Jan 2025 10:43:47 +0100
From: Daniel Gomez
To: David Hildenbrand, Ryan Roberts, Barry Song, Andrew Morton
CC: Luis Chamberlain, Pankaj Raghav
Subject: Swap Min Order
Message-ID: <20250107094347.l37isnk3w2nmpx2i@AALNPWDAGOMEZ1.aal.scsc.local>

Hi,

High-capacity SSDs require writes to be aligned with the drive's
indirection unit (IU), which is typically larger than 4 KiB, to avoid
read-modify-write (RMW) cycles. To support swap on these devices, we
need to ensure that writes do not cross IU boundaries.
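
To make the constraint concrete, here is a minimal userspace sketch in
C (not kernel code). It assumes a hypothetical 16 KiB IU and that the IU
is a power of two; the names write_is_iu_aligned(), write_crosses_iu()
and EXAMPLE_IU_BYTES are made up for illustration and do not exist in
the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical IU size, for illustration only; real drives differ. */
#define EXAMPLE_IU_BYTES (16u * 1024u)

/*
 * True if the write starts on an IU boundary and its length is a
 * multiple of the IU. Assumes iu is a power of two.
 */
static bool write_is_iu_aligned(uint64_t offset, uint64_t len, uint64_t iu)
{
	return len != 0 && ((offset | len) & (iu - 1)) == 0;
}

/* True if the write touches more than one IU, i.e. crosses a boundary. */
static bool write_crosses_iu(uint64_t offset, uint64_t len, uint64_t iu)
{
	if (len == 0)
		return false;
	return (offset / iu) != ((offset + len - 1) / iu);
}

#ifdef IU_DEMO_MAIN
int main(void)
{
	/* A 4 KiB write at offset 4 KiB: inside one IU, but not IU-aligned. */
	printf("4K@4K:   aligned=%d crosses=%d\n",
	       write_is_iu_aligned(4096, 4096, EXAMPLE_IU_BYTES),
	       write_crosses_iu(4096, 4096, EXAMPLE_IU_BYTES));

	/* An 8 KiB write at offset 12 KiB: straddles the 16 KiB boundary. */
	printf("8K@12K:  aligned=%d crosses=%d\n",
	       write_is_iu_aligned(12288, 8192, EXAMPLE_IU_BYTES),
	       write_crosses_iu(12288, 8192, EXAMPLE_IU_BYTES));

	/* A 16 KiB write at offset 32 KiB: covers exactly one whole IU. */
	printf("16K@32K: aligned=%d crosses=%d\n",
	       write_is_iu_aligned(32768, 16384, EXAMPLE_IU_BYTES),
	       write_crosses_iu(32768, 16384, EXAMPLE_IU_BYTES));
	return 0;
}
#endif
```

Building with `cc -DIU_DEMO_MAIN` runs the small demo. A write that is
not IU-aligned covers only part of at least one IU, and that partial
coverage is what makes the drive do RMW.
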
So, I think this may require increasing the minimum allocation size for
swap users.

As a temporary alternative, a proposal [1] to prevent swap on these
devices was sent for discussion before large block size (LBS) support
was merged in v6.12 [2]. Additional details and reasoning can be found
in the discussion at [1].

[1] https://lore.kernel.org/all/20240627000924.2074949-1-mcgrof@kernel.org/
[2] https://lore.kernel.org/all/20240913-vfs-blocksize-ab40822b2366@brauner/

So, I'd like to bring this up for discussion here and/or propose it as a
topic for the next MM bi-weekly meeting if needed. Please let me know if
this has already been discussed.

Given that we already support large folios with mTHP in anon memory and
shmem, a similar approach that avoids falling back to smaller
allocations might suffice, as is done in the page cache with min order.

Monitoring writes on a dedicated NVMe device with swap enabled, using
the blkalgn tool [3], I get the following results:

[3] https://github.com/iovisor/bcc/pull/5128

Swap setup:

mkdir -p /mnt/swap
sudo mkfs.xfs -b size=16k /dev/nvme0n1 -f
sudo mount --types xfs /dev/nvme0n1 /mnt/swap
sudo fallocate -l 8192M /mnt/swap/swapfile
sudo chmod 600 /mnt/swap/swapfile
sudo mkswap /mnt/swap/swapfile
sudo swapon /mnt/swap/swapfile

Swap stress test (guest with 7.8 GiB of RAM):

stress --vm-bytes 7859M --vm-keep -m 1 --timeout 300

Results:

1. Vanilla v6.12 with mTHP disabled:

I/O Alignment Histogram for Device nvme0n1
     bytes               : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 3255     |****************************************|
      8192 -> 16383      : 783      |*********                               |
     16384 -> 32767      : 255      |***                                     |
     32768 -> 65535      : 61       |                                        |
     65536 -> 131071     : 24       |                                        |
    131072 -> 262143     : 22       |                                        |
    262144 -> 524287     : 2136     |**************************              |

The histogram above shows the alignment of writes, in power-of-2 steps,
for the nvme0n1 device dedicated to swap. The corresponding granularity
of these writes is shown in the linear histogram below, in units of
512-byte sectors (e.g. 8 sectors is 8 << 9 = 4096 bytes). So the first
row indicates that 821 writes were sent with a size of 4 KiB, and the
last one shows that 2441 writes were sent with a size of 512 KiB.

I/O Granularity Histogram for Device nvme0n1
Total I/Os: 6536
    sector : count    distribution
         8 : 821      |*************                           |
        16 : 131      |**                                      |
        24 : 339      |*****                                   |
        32 : 259      |****                                    |
        40 : 114      |*                                       |
        48 : 162      |**                                      |
        56 : 249      |****                                    |
        64 : 257      |****                                    |
        72 : 157      |**                                      |
        80 : 90       |*                                       |
        88 : 109      |*                                       |
        96 : 188      |***                                     |
       104 : 228      |***                                     |
       112 : 262      |****                                    |
       120 : 81       |*                                       |
       128 : 44       |                                        |
       136 : 22       |                                        |
       144 : 20       |                                        |
       152 : 20       |                                        |
       160 : 18       |                                        |
       168 : 43       |                                        |
       176 : 9        |                                        |
       184 : 5        |                                        |
       192 : 2        |                                        |
       200 : 3        |                                        |
       208 : 2        |                                        |
       216 : 4        |                                        |
       224 : 6        |                                        |
       232 : 4        |                                        |
       240 : 2        |                                        |
       248 : 11       |                                        |
       256 : 9        |                                        |
       264 : 17       |                                        |
       272 : 19       |                                        |
       280 : 16       |                                        |
       288 : 7        |                                        |
       296 : 5        |                                        |
       304 : 2        |                                        |
       312 : 7        |                                        |
       320 : 5        |                                        |
       328 : 4        |                                        |
       336 : 23       |                                        |
       344 : 2        |                                        |
       352 : 12       |                                        |
       360 : 5        |                                        |
       368 : 5        |                                        |
       376 : 1        |                                        |
       384 : 3        |                                        |
       392 : 3        |                                        |
       400 : 2        |                                        |
       408 : 1        |                                        |
       416 : 1        |                                        |
       424 : 6        |                                        |
       432 : 5        |                                        |
       440 : 3        |                                        |
       448 : 7        |                                        |
       456 : 2        |                                        |
       472 : 2        |                                        |
       480 : 2        |                                        |
       488 : 7        |                                        |
       496 : 5        |                                        |
       504 : 11       |                                        |
       520 : 3        |                                        |
       528 : 1        |                                        |
       536 : 2        |                                        |
       544 : 5        |                                        |
       560 : 1        |                                        |
       568 : 2        |                                        |
       576 : 1        |                                        |
       584 : 2        |                                        |
       592 : 2        |                                        |
       600 : 2        |                                        |
       608 : 1        |                                        |
       616 : 2        |                                        |
       624 : 5        |                                        |
       632 : 1        |                                        |
       640 : 1        |                                        |
       648 : 1        |                                        |
       656 : 5        |                                        |
       664 : 8        |                                        |
       672 : 20       |                                        |
       680 : 3        |                                        |
       688 : 1        |                                        |
       704 : 1        |                                        |
       712 : 1        |                                        |
       720 : 3        |                                        |
       728 : 4        |                                        |
       736 : 6        |                                        |
       744 : 14       |                                        |
       752 : 14       |                                        |
       760 : 12       |                                        |
       768 : 3        |                                        |
       776 : 5        |                                        |
       784 : 2        |                                        |
       792 : 2        |                                        |
       800 : 1        |                                        |
       808 : 3        |                                        |
       816 : 1        |                                        |
       824 : 5        |                                        |
       832 : 2        |                                        |
       840 : 15       |                                        |
       848 : 9        |                                        |
       856 : 2        |                                        |
       864 : 1        |                                        |
       872 : 2        |                                        |
       880 : 10       |                                        |
       888 : 4        |                                        |
       896 : 5        |                                        |
       904 : 1        |                                        |
       920 : 2        |                                        |
       936 : 3        |                                        |
       944 : 1        |                                        |
       952 : 6        |                                        |
       960 : 1        |                                        |
       968 : 1        |                                        |
       976 : 1        |                                        |
       984 : 1        |                                        |
       992 : 2        |                                        |
      1000 : 2        |                                        |
      1008 : 16       |                                        |
      1016 : 1        |                                        |
      1024 : 2441     |****************************************|

2. Vanilla v6.12 with all mTHP enabled:

I/O Alignment Histogram for Device nvme0n1
     bytes               : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 5076     |****************************************|
      8192 -> 16383      : 907      |*******                                 |
     16384 -> 32767      : 302      |**                                      |
     32768 -> 65535      : 141      |*                                       |
     65536 -> 131071     : 46       |                                        |
    131072 -> 262143     : 35       |                                        |
    262144 -> 524287     : 1993     |***************                         |
    524288 -> 1048575    : 6        |                                        |

In addition, I tested and monitored writes with SWP_BLKDEV set for
regular files, which allows large folios for swap files on block
devices, to check the difference (see the diff at the end of this
mail). The alignment results are as follows:

3. v6.12 + SWP_BLKDEV change with mTHP disabled:

I/O Alignment Histogram for Device nvme0n1
     bytes               : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 146      |*****                                   |
      8192 -> 16383      : 23       |                                        |
     16384 -> 32767      : 10       |                                        |
     32768 -> 65535      : 1        |                                        |
     65536 -> 131071     : 3        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 1020     |****************************************|

4. v6.12 + SWP_BLKDEV change with mTHP enabled:

I/O Alignment Histogram for Device nvme0n1
     bytes               : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 240      |******                                  |
      8192 -> 16383      : 34       |                                        |
     16384 -> 32767      : 4        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 1        |                                        |
    131072 -> 262143     : 1        |                                        |
    262144 -> 524287     : 1542     |****************************************|

2nd run:

I/O Alignment Histogram for Device nvme0n1
     bytes               : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 356      |************                            |
      8192 -> 16383      : 74       |**                                      |
     16384 -> 32767      : 58       |**                                      |
     32768 -> 65535      : 54       |*                                       |
     65536 -> 131071     : 37       |*                                       |
    131072 -> 262143     : 11       |                                        |
    262144 -> 524287     : 1104     |****************************************|
    524288 -> 1048575    : 1        |                                        |

For comparison, the histogram below shows a stress test with
random-size writes on a drive with LBS enabled (XFS with a 16k block
size):

I/O Alignment Histogram for Device nvme0n1
     bytes               : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 1758     |*                                       |
      1024 -> 2047       : 476      |                                        |
      2048 -> 4095       : 164      |                                        |
      4096 -> 8191       : 42       |                                        |
      8192 -> 16383      : 10       |                                        |
     16384 -> 32767      : 3629     |***                                     |
     32768 -> 65535      : 47861    |****************************************|
     65536 -> 131071     : 25702    |*********************                   |
    131072 -> 262143     : 10791    |*********                               |
    262144 -> 524287     : 11094    |*********                               |
    524288 -> 1048575    : 55       |                                        |

The test drive here uses a 512-byte LBA format, so writes can start at
any 512-byte boundary. However, LBS/min order makes most of the writes
fall on 16 KiB boundaries or greater.

What do you think?

Daniel

diff --git a/mm/swapfile.c b/mm/swapfile.c
index b0a9071cfe1d..80a9dbe9645a 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3128,6 +3128,7 @@ static int claim_swapfile(struct swap_info_struct *si, struct inode *inode)
 		si->flags |= SWP_BLKDEV;
 	} else if (S_ISREG(inode->i_mode)) {
 		si->bdev = inode->i_sb->s_bdev;
+		si->flags |= SWP_BLKDEV;
 	}
 
 	return 0;
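
As a note on reading the histograms in this mail, the following
userspace C sketch shows the arithmetic involved. The sector-to-byte
conversion (sectors << 9) is the one stated in the mail. The alignment
helper is only an assumption: it treats the alignment of a write as the
largest power of two that divides its starting byte offset, which is
not necessarily how blkalgn computes it. All function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, as in the granularity histogram */

/* Convert a size in 512-byte sectors to bytes, e.g. 8 << 9 = 4096. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors << SECTOR_SHIFT;
}

/*
 * Assumption for illustration: alignment is the lowest set bit of the
 * starting byte offset. Offset 0 is reported as 0 here ("any alignment").
 */
static uint64_t offset_alignment(uint64_t byte_offset)
{
	return byte_offset & (~byte_offset + 1);
}

#ifdef HIST_DEMO_MAIN
int main(void)
{
	/* First and last rows of the granularity histogram. */
	printf("8 sectors    = %llu bytes\n",
	       (unsigned long long)sectors_to_bytes(8));
	printf("1024 sectors = %llu bytes\n",
	       (unsigned long long)sectors_to_bytes(1024));

	/* A write starting at byte offset 20480 (5 * 4096). */
	printf("alignment of offset 20480 = %llu bytes\n",
	       (unsigned long long)offset_alignment(20480));
	return 0;
}
#endif
```

Building with `cc -DHIST_DEMO_MAIN` runs the demo. Under the stated
assumption, a write starting at offset 20480 would be counted in the
"4096 -> 8191" alignment bucket.
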