Message ID | 13333310fd93096c8b71a35ae66f634ca82a2dc6.1664363162.git.robin.murphy@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: Usercopy refresh | expand |
Hi Robin, On Wed, Sep 28, 2022 at 12:58:52PM +0100, Robin Murphy wrote: > As with its counterpart, replace copy_to_user() with a new and improved > version similarly derived from memcpy(). Different tradeoffs from the > base implementation are made relative to copy_from_user() to get the > most consistent results across different microarchitectures, but the > overall shape of the performance gain ends up about the same. > > The exception fixups are even more comical this time around, but that's > down to now needing to reconstruct the faulting address, and cope with > overlapping stores at various points. Again, improvements to the > exception mechanism could significantly simplify things in future. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > arch/arm64/lib/copy_to_user.S | 391 +++++++++++++++++++++++++++++----- > 1 file changed, 341 insertions(+), 50 deletions(-) Mark reported (off-list) a KASAN splat from Syzkaller on for-next/core: | BUG: KASAN: stack-out-of-bounds in _copy_to_iter+0x1084/0x131c We've not had any reports of this before (or on other branches) and Mark is unable to reproduce the problem on mainline. I suggested trying for-next/uaccess and he confirmed that the problem has been introduced there. Hopefully he can chime in with more details. In the meantime, I think it's safest to drop this branch for now considering where we are in the release cycle. I'll leave it pushed to the arm64 git, so we can put fixes on top and try to merge it next time around instead. One thing I _did_ notice from a quick skim of the code is: > + /* Fixups... */ > + > + /* > + * Fault on the first write, but progress may have been possible; > + * realign dst and retry a single byte to confirm. > + */ > +L(fixup_pre): > + mov dst, dstin > +U_dst( ldtrb A_lw, [src]) > + strb A_lw, [dst], #1 > +L(fixup_dst): > + sub x0, dstend, dst > + ret Why is U_dst() being applied to the load for a copy_to_user() here? I would've expected to annotate the store. Will
On Mon, Dec 05, 2022 at 04:09:10PM +0000, Will Deacon wrote: > Hi Robin, > > On Wed, Sep 28, 2022 at 12:58:52PM +0100, Robin Murphy wrote: > > As with its counterpart, replace copy_to_user() with a new and improved > > version similarly derived from memcpy(). Different tradeoffs from the > > base implementation are made relative to copy_from_user() to get the > > most consistent results across different microarchitectures, but the > > overall shape of the performance gain ends up about the same. > > > > The exception fixups are even more comical this time around, but that's > > down to now needing to reconstruct the faulting address, and cope with > > overlapping stores at various points. Again, improvements to the > > exception mechanism could significantly simplify things in future. > > > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > > --- > > arch/arm64/lib/copy_to_user.S | 391 +++++++++++++++++++++++++++++----- > > 1 file changed, 341 insertions(+), 50 deletions(-) > > Mark reported (off-list) a KASAN splat from Syzkaller on for-next/core: > > | BUG: KASAN: stack-out-of-bounds in _copy_to_iter+0x1084/0x131c FWIW, I was testing with the config framents in: https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=testing/arm64-fnc-20221202 ... building with: usekorg 12.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- defconfig testing.config syzkaller.config kasan.config ... and the full splat was: | ================================================================== | BUG: KASAN: stack-out-of-bounds in _copy_to_iter+0x1084/0x131c | Read of size 8 at addr ffff800017f27cc8 by task syz-executor.1/391 | | CPU: 1 PID: 391 Comm: syz-executor.1 Not tainted 6.1.0-rc4-00108-gc7586d0e4d65 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace.part.0+0x1c8/0x1d4 | show_stack+0x34/0x5c | dump_stack_lvl+0x11c/0x18c | print_report+0x194/0x480 | kasan_report+0xb8/0x250 | __asan_report_load8_noabort+0x28/0x3c | _copy_to_iter+0x1084/0x131c | copy_page_to_iter+0x128/0x840 | pipe_to_user+0xb8/0x160 | __splice_from_pipe+0x358/0x744 | __do_sys_vmsplice+0x3c8/0x8a0 | __arm64_sys_vmsplice+0x94/0xe0 | invoke_syscall+0x8c/0x2d0 | el0_svc_common.constprop.0+0xf4/0x300 | do_el0_svc+0x6c/0x1d4 | el0_svc+0x54/0x120 | el0t_64_sync_handler+0xf4/0x120 | el0t_64_sync+0x190/0x194 | | The buggy address belongs to stack of task syz-executor.1/391 | and is located at offset 488 in frame: | __do_sys_vmsplice+0x0/0x8a0 | | This frame has 7 objects: | [32, 40) 'iov' | [64, 72) 'start' | [96, 136) 'iter' | [176, 216) 'buf' | [256, 312) 'sd' | [352, 480) 'iovstack' | [512, 640) 'pages' | | The buggy address belongs to the virtual mapping at | [ffff800017f20000, ffff800017f29000) created by: | kernel_clone+0x1a4/0xb40 | | The buggy address belongs to the physical page: | page:000000009402e755 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4a38c | memcg:ffff000017b86e02 | flags: 0x3fffc0000000000(node=0|zone=0|lastcpupid=0xffff) | raw: 03fffc0000000000 0000000000000000 dead000000000122 0000000000000000 | raw: 0000000000000000 0000000000000000 00000001ffffffff ffff000017b86e02 | page dumped because: kasan: bad access detected | | Memory state around the buggy address: | ffff800017f27b80: f2 f2 00 00 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 | ffff800017f27c00: 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 | >ffff800017f27c80: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00 | ^ | ffff800017f27d00: 00 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 | ffff800017f27d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ================================================================== The reduced reproducer for this was: | r0 = syz_io_uring_setup(0x6bc0, &(0x7f0000000000), &(0x7f0000ffe000/0x1000)=nil, &(0x7f0000002000/0x2000)=nil, &(0x7f0000000080), &(0x7f0000000100)) | mmap$IORING_OFF_SQES(&(0x7f0000002000/0x4000)=nil, 0x4000, 0x0, 0x12, r0, 0x10000000) | pipe2$9p(&(0x7f0000000000)={<r1=>0xffffffffffffffff, <r2=>0xffffffffffffffff}, 0x0) | vmsplice(r2, &(0x7f0000001540)=[{&(0x7f0000000040)="94", 0x1}, {&(0x7f0000000180)="a40cfbddd20ba9161c5596b9f8ab0268dd4bb5a16b47f86c2fcf444faa265e8d73e21b48001c57589092db934915ad208fdcdea67bac381f76ee3bcf6f25f36662e56e905acf80567bcdf60a8a4262805c26076857dc92113c0253ada9ffa5509aaa2f2733eeaab09b4cb28bff530a25c038e919a00803c705c97734473169f0dbb85e818c6a87256c7403480c44418cdfbe15f0d0747b01fd1f311951340b4ee16fdd3b3e5375a766b4240b657a0d1858f7e6c41d632d9256acfff12153f8d9dc90a3fc81f41b230b8d5516d9b855a67154e7ba9eee191152e7ab3fbced27ddf10a58fc8ff1c0652045ef3aeaf1ae8a8354abd6bcf9bee9a0f56cc1bd9c1f9050bb10e14882f12f66a74545033d9ca721f40d1de0ad9ad326b8432ac779e0fe24a8f7ce68d6c3d268d9d430dd57fcb9e1d76747d077ffc0b2157acb0d0d37ee609eb50a20e49586cad91693210ce24cea985abce4138ddf10c5334b1f353682eeceb863c5df1ca0665330030c30971b0f1d312199cb419afa33a36b1043afcc3319f632ab0ed56863f76d2eaec351b5c43b3b9140d430689bd226ed7d9cbdae94a7fbc37e4f57626701e322aaf110b1eb69d4bfdeb20de40dd031efd77a9890d51be497ff7e358905a22ebe71b0fc0566c08f32d3b9468266e6a58729b7f62c83e55716e8e927723bd8cb3fb70c90fb5dc8812818967bfe8fbbb058ad58ed21ce50fe1a55039f8d123c086ca1f78a56981b28be8a2484e52bc3d5a3b773997484076d04c646b57651cf67a4a1cd6aaacae25d926c6bb4c073009baf1e6c6f1dc4fc6b59c43ed99025e3feb446999e2a4817cbed6543e117ca1f4fb43703b78759e5762cc78d2aa935a67f2bf6229e13cda47023aabd4c4b6faba3db2a6abe0bfd2a65323df124d22dd6c7ed80bd643fa912c071dfe42e4c579fc7dd53a229b39a32cb9560c12bac3691ddb02fdf1d1ad49b1052a011619464bbae8c215041aa5b1a9007ae8863a931c05b3043e1a058c8be34849f02453a527dd775efdba1d7bdc4411c9c5adfb2f7692fb2cb6a43d5eb99d6278b64ea24916df21c73c35a726a3b03bf7c8e3de2b3b3c615d3ba714619b5151b7e8396f639112e2ac09ad167e63b18aab3a5478f7281d9ae5b0c1a0ac86cc20bf3df57075dfd0f2491c9ab5cb422dfed832dd0d406623a85a1ccf7f37270d228538328798591a5bc516629eee659f8146f577347bf0f6453f1d3aad97b660357da6160506278d146a5d720c2bcc572d9fa6e90b52444db803226249fe2657235aa3db2d7e7a8a1d5fbb7acc4dd5e1de056746af8256d32e8e936ee134cf5efb0a9eea03b7269ddfde666620be98dffd632bcb1c7e795bebabdb96b50c78d3c29d3ef235ddcd52b6c73f2aed472bff1e093756ba832c0ab853841906613f95682f0cdc37a851d57ac4f8918d77de65d5577e2c65225748f9d0e837445db3ffae17289724ab3e4a39e906b4b4661fd4c1599b30c9b9967d09fdbbc9f3727766eeb63f927ae751808026652399df01f82abf51a5293e0283cc63022c40824354bf884ff2b608fb6bbe8de8feec644708a282bf2103ab68efe46b3dcbcc20a0e178238f33c9e0a94b31cc68a472ee1e86bc99d90d8b220c91d72130f81eba41a58de44af2f03b5c0f4f9eda244e99b4ebf377483cdd7bae65de86e99e2f5dca94b8224388c3517752269eba1bd47f9c322b63dd497d6e1f195d31f857930af59b1aa937f0821728b70e8257bc4697b2dc07a1bf7c15034a4c2ab29d83257e63331d0c4c1c1c248f99dfe2862f42b99df8fb6431b7af90b62ce878c6ca3ed738d39fd66d8762b0dd401b1ed502dd0606b46a86ad532a40a0e14e67f30c413049682ec18deee5f1871ddbf7a7d945dfc69930a4642ac1e18a5cad46daa272436c6460d0fe03ee301d345af898a952f4328f5c786c1e1df1456cc8ea350d5077b5a8c08abc1e9494ce4b8bee08543a3fe22b9cb278082ae20e4430f28e0950f9aa3f45ea9f59735b02fedae3bbf6b360e8408dc0b15843666f04695c6e596b7596e7712ba9cfd53695fa2ae4f9887c84ed4bdbbd9977f4c457cc8133c957b38aefd1afd189e24fa4fb2373aab404736f62133ab1923342d654e0ca30f491ba7ceee626ab7cdcfa9f8d679bd63d69d0fb6031488442fb9e7307ae1873516764120888f38a4e456ec63e85fb5edef778d7a73636cba18a05ec1cc8fbe6b58a1f55d9d9519988e330f79357b88d92e0437d3f57715618387201fc4f716000be3a85652ec83d70abd313c6f51195d8582a40b941bbef0d632645890705433fdd0706e3d0a6850f90c94c3b7f45dfaee963705c86f25bcb85d82b11c84937b7fbd6b196c0e1dc44f2ef8c728e7aa61627e852e1d4fe978fb89785eecf51dd100c7906c597c8cdb377514ed21d82524b4acdb27e497a205cad6b4e267a6568b0b5a92d76657f5ae16368ae1f8988e9e84c45c2119c6b88df05044c36c254b4a2d2cec514e00785b4827eaed5947f9465433b6f68e26d7ee85a5898f56bfe75745cb79b9028638c0d165d15c52e3cc28f25d7004423fcc0c3cba38ddf8bc7f3a671f27e72b1df7ebc23912e931b5b53a2fc21c221e17b1115ade67b33ce1a4600b85eb6a04c63cd72346d01abcc31f79f5367d87e509309f5204c9940bf7de850173f632459182966601403959425da461cd31dcb5e6743d829d17a3d79d748dbe019bcd88b48804aa1776f48fdcff110fdcb9d5ad765560706e469d91fc55558887b54b752b5ec9b7de8a4ba49c7fa7662f0ab86220024ebd168a6ab9af7c75fabc2835b3ed4364ee1422999a2687dbceae58b490317f7531c5566f2408c16f33935f0015776fae25260074b01b610ab5643a70592a0c8ddf7e8ddb33ef0ddfdc0b8d256da0e173ecd90fe932040e80de0d194d76bd60cf401eaa4e25662062a32b0c6d1f40576a313609ea153268c0819d15481ad7361b20f09e61f6d5c5ff55d7900b8588cf7fe003d73d2cfa447afd1b3cef7e8d0b6b910b764afb43f835ef365705161eae14f63e67b2f3c0e4cdfd8eee6a95b14c9a37c4c20dc818adb645562bce72cf7f027091174d1e6076a292f5d491459d160b8ccb363aeba6dc0af4b8330cc1304e5f1da2aab6efab5adad72d924967331b35fc1c371bbbf8cf7d78d79e48c2ca71fb5a52d9415f17a96a0b31c2f0a5d9a9fbc171c63f1d2ea106993ac8c4c3aef6cea4298c7e57cf6923edb9c27c4ffff06a389e85ed69ceec61d146d093ede6064303f36baeb45813c28bb73371bc10a5ab4aa0bb7790898f19da71e580a56c6b7605caffa3e6b404a427004c2533d7cd48bb76fa7610607d95ec6eef644992337f420bddf8a9ef1b19320bce1f325c7655246a932dfc14518a4677d43e75b64edcd16267c82ec86ceb935f3b9a44e22aebcf2e1b5f6e73fb4cec247519878d9d0fd34aaef4c27d52dd78627849e623e0d9e1d745532ec3c182287d60d957817539515ce6f3fd0a1fd241713f1816f2d0a063e636b5adcd315b4ece5fd8dbe9ca81373d55915e199c304ba9b4532f3708f49971f6b1dfd4a4686ba6ff4aa52ddd54123765551b1c3fb1e3520f753b180e673cebc71e7bac24f2aba30538a8aec8a849c2e92a89e9755391dcfe33b4ef34c3598d5e255e30cd0d7fb3ede65fff7594c06d39569d17655ec7332676b124e29d3c5b3a24aba82338388601629da385f36fbceaaa91716b32f926df488f552012fecfe610e7b7cff725bf313bbf13cf622e3354c5320a95309760c5", 0xa80}], 0x2, 0x0) | vmsplice(r1, &(0x7f0000001280)=[{&(0x7f0000001580)="29c7dedc1fc83e848a08d1ab6a6b70a8aba7434e6ad302c1f808d3e154d9a375904104ac349cb5edcd78b744ba6b27cc8f0ad38dda013828b1dcc8ef23a0b0cc865e9a527c9a0bf17f17e433f3700591c5b6a68a430559421b3a960dfb3d8e1a6c3b4144fc1820f4c11f013aaf7591983196aaed77577d52f8423632ce590d28c9f45cb91c745f407ac8af5e75892e589c1b5c73bc0f4646400545073a34214c4d5791711951fc2212cfcabd101ec69f54209c55df4ce1f258a84f979a13d2b01af941b372c111e518022ac74083fc15c3d649dfc3abff6af7d8dbb3e3a9702e3e6a175578176b02c1fbaef7060642f498626feb394e0fa0e6cca23849726911d16c6fb0507172611edd52f28abf5ba517dff8fbae13e8f5356e6a3fe6df0294180dfa82f4c0f8f3c0f297df774e986a48d5d83c6d669722c8e2516220f5f09189459282ab34a733acb7683f7f108bc5997190d527b56efb47e6a46cb3960ba626d3435d817a1ea26959b8a9cb52465dab60efa5d2e81fa8cc07fbbc9550b86163c9d603ae03e7000c50fbb0ce4c0481008892db89dc249bfb15929533f240f1d1a0c8a88e1b836e8ecf9510d10e98f2ef740ce61c885cf00a7aec6f8eee033212cb0d0c7a78a100745453c008502c08e9dfea6456f977956e208102c8ef93a275e4a1a41cbbd964d8f42ff81c729c5742cfc37d33965b07f1b2157e5aa456bb259dcaf9baa775c6d9474e7099eb811db6edefadb4b7a264d4b5c125fe514b710efe9c32ce8ec566e3d643e7771001dc64b72d44e8607032dc2b832a25b0a74db91bf2405fb69bd32fb7ca14ab52570a3074ec20b42ffc71e5bfb8a16582e2b48d784eb09d653af90419920ed7233eb64e6a22ad31833b17ab3d3b0d826ec7d69faf0c234187d6ec20844bb627bc1c476c3756b104e62c99dcd53f1e507f188be09ada0f7595cea3b3df488506398406a7782a0cade85e9ea925dc904d6028052e31b908ff3c7c1d0f0bb49be4208634c8bdb88463f7cb9942ab8621191679f5fc459906d9b63af25e1e7ef68bf7472fbed1585eb3620219691dfc692c3dbfe9aed440dccf2fb96dd7bb6f269d2043abdcc2f3aa469b9ace6e29dfe8846b9b9c17f7a0616fc59cbb66ff7c01a9fa670f4858de60decef9bd3fbf892691e73e40477b0ca55c5463322d8b228a3168a540b029072c8e6f56819fd2ef1486156429769f253d9126901b4e37b24261c127f9948674fcc3855b6f676e5979edd9b857c348ec176d2977fc09431cb35b1ad80fac77ec7a89d9aa9ecf72e66bab174da68cfeb902e5ac1ad8504514bf2fdc45b1fc716f02c4e490168a574bfb1b11dfabd9ead6ab86cc8bc8f861cea8e294bed3651dd78bf11ee548250bc749eb9d81e9fe9fc066884af8d7ea3e5b556e1e98797e8e2022e3fd523508f0b3b2e9cce819cf8779e77937d191e9ff2ece77c530af5ebbba5d642b1fc3e8c32bfe14a1535676e9ce6d23e89b0f22c2e9e739b073857c015cf75be65a0bdbe70145f13c9cdcf4eafcfb1bb733d1b26eb7a5992e3f810f0af8403bb75b55e4430a7d2e8724cbdd7a22c1d4043530c9ee4388ff08331a16664cd9f4348a64d10f5db094ac3336f830b8428b88ec3e46ab56d089efb61c0ef08dd478de5b54288c6d42640367922a9eb84c3024adda569a8792be03217d1b791f6d31e21dff9808e779411190d2294b210d67ffde952f1c65231f5ac82d15fc98072fc1e43605346e5116aa6b5d8372411913c80b06cc71c1ee0608e4ed215550c44db4e3d5ac295cacd486a97813589d96b7b1a105155176baeb6ed29092fa9327d0d7d67ab6c58f7c2f9fca5fe117c3b999e1492fcf9d5e101e9ad61c132e6e3165ee81ddbaa9e403feb2a66fca150735ad95d9b9295ade8c3c396c5bad4d48b05e1ed09cefafd7e3e35818ee5550e7c76ff52432105469a79e7133008dd025695c6ed2a459a8d6c7f49ba44a12b7c7bb9e91eff1634d1106abfbbc7c8fd690961184462bfdcfd1e8b8c730fd971b95e929db1089e04b179eaa094318bc31cb46bb86a64dac45ae0da7222c981d40b6b193c2fd6b374a4b483a690a25589e48e493c26f4604899653fcd7b18b7601140631a7beed7921e8ca6befb917d56fedcbf3be67da78b3dd6c0653af71ecda583570678c331ac0e98c881938a6e9835a627b48c72b5c604c92cbcb19ac7b68a619f45b9fcedc846e7210ed73307cb8a3b5d87f2a6f3a97e74aad9f84d5bbd395f67c5dc56b1b16ff5221962b24feeb38ea32ffcbe1aadc0cc85825533ce22734c2e72fafb99b1d7d4fa9140877a90fae4cf38fb90cf63c527e49f1c8a9ea5b3e4851f77df566439010776820214b6fae22a2f99ccbd86593034348d7b0668bd2ac4d4456f89ab5756a17f8484008905bd1a08430f5cdeac84525ced42092def29f4bf1b7703befceae16da3d43ff6a19aae61d27ce7b94fbdff56da0035511e0d36d7f01b5f15b5a1f42b3f4c07c03d68ce58dfce67ae6634ccdd382e4b68dbc1df0a098bd75ff459b7028d1caeb0c478a178e67253c013a2b9d19ec62f10b90c26259def748c2836cec94f92fa81470d5dbd891a5034bd5841c7b4af7a3bed1b67de385f7e0fb85ed53928da52a0b98e2ab5f32719578c3d4d8ca0f3f7d9c82ef9cd405e0322097f8372bb6e537754f0599bdaa657f2b9aadef7373e98e6dc9482e3dbe88aa554c70191f0ddcd8588da337d2d4389e4a468ec21ed7eb3653a45cb87cb3c626b7f1d03cb0a385fdaf3b30ca1ee145a7729958c330e69e91f058cac7c4b74149369266b758cf62f1b8474e0e1d96101812b6c24abb244ffac625b83cd50e8b1c8d9ba7dbb1e92fd9e6a8f158b3abff84c24413e772b18cdb12ecff68f516dc3191840e5611eb3e1581d477a496ee4403230decbe9cc1ef0aa1beda4c15fa9f70874a7494a83e450039ef2385d19fe381de1580449e206f31bd1ba17d99c60bc6eb164174d69320679089c474e4befdaabd688c4102047a808340094a38934d327f044ced1070f9c3d7fffe3863ff61417fb263c351d7324ddd8e4361398c87eaed9eab9a63a4f0a23122a1c758acf81f91a0f9109df4c8af6d4f0900fc3ef04f9e9c6825ee42d6d76e3ed75b845e5da333804af68fed85ad3271235f7ad87d6325c1426dcca790ae3f3c4c5980d1ae7d9cb5dc577f02babe2f0198468c12fd54a9222e2a6bdc70bea9e9c8b1d9a1e7f0b090dfefb36250bbac93f636aeac45b4f92d5b0b2fb3b91d36e48c8499ebe4c7f9d7aadc2a327e1c08201f695b072933d1d223c7fbe0c059244ca4f874e475336c6740e235ab9457e00c5da38686903edbfd5834385e9c730749002e798040f12a37e415f6ba49ad23ae79650ca233fcef58d8b412cdef87c70d5897b25dcf4f6caaa266da726104549a6158165b4724c5ec9ef7bc807f46beb2691cf3de90d2bf72ba26e55a47ae74e8a0854c4f8cf4a57ef38b9e431fd34850fa69336ed46912727532352ebf73595edaab6065917d3a2bb775ac7424fafd380be00adf58a8fdd2b0b6e1b536fd77876a8a18097b03ca0ccefb655bda134bd2386b1bc4b1489a38e54d025612c36305a245649fc89fe281cd8f93395f00f6dc5e1417f353b84d52f08cf2ce44a97601ab513cdb351a5a608e03ac8c233764572e3c006c203cea941cdeec4bda877edc887a07ff0003cf1342f1cc64857bffd264104f850a5382efe5cbe5c439af6007b2f64555b1d276388ef50134d0ca819b52105351a5570adc17757601d890", 0xa81}], 0x1, 0x0) ... which blew up atop for-next/uaccess, but was fine atop v6.1-rc4. A bisect ended with: c1dd028491d18d88d4ece4e5ad8473440a68a6b1 is the first bad commit i.e. c1dd028491d18d88 ("arm64: Update copy_to_user()") ... which seesm to agree with Will's analysis below. > We've not had any reports of this before (or on other branches) and Mark > is unable to reproduce the problem on mainline. I suggested trying > for-next/uaccess and he confirmed that the problem has been introduced > there. Hopefully he can chime in with more details. > > In the meantime, I think it's safest to drop this branch for now considering > where we are in the release cycle. I'll leave it pushed to the arm64 git, > so we can put fixes on top and try to merge it next time around instead. > > One thing I _did_ notice from a quick skim of the code is: > > > + /* Fixups... */ > > + > > + /* > > + * Fault on the first write, but progress may have been possible; > > + * realign dst and retry a single byte to confirm. > > + */ > > +L(fixup_pre): > > + mov dst, dstin > > +U_dst( ldtrb A_lw, [src]) > > + strb A_lw, [dst], #1 > > +L(fixup_dst): > > + sub x0, dstend, dst > > + ret > > Why is U_dst() being applied to the load for a copy_to_user() here? I > would've expected to annotate the store. Ignoring the annotation, that's an LDTRB (an *unprivileged* load) feeding into an STRB (a *privileged* store), which is definitely the wrote way around for copy_*to*_user(). I think this is a copy-paste mixup with the code in copy_from_user(), since the two are identical here: # copy_from_user() | /* | * Fault before anything has been written, but progress may have | * been possible; realign dst and retry a single byte to confirm. | */ | L(fixup_pre): | mov dst, dstin | U_dst( ldtrb A_lw, [src]) | strb A_lw, [dst], #1 | L(fixup_dst): | sub x0, dstend, dst | ret # copy_to_user() | /* | * Fault on the first write, but progress may have been possible; | * realign dst and retry a single byte to confirm. | */ | L(fixup_pre): | mov dst, dstin | U_dst( ldtrb A_lw, [src]) | strb A_lw, [dst], #1 | L(fixup_dst): | sub x0, dstend, dst | ret Thanks, Mark.
On 2022-12-05 16:46, Mark Rutland wrote: > On Mon, Dec 05, 2022 at 04:09:10PM +0000, Will Deacon wrote: >> Hi Robin, >> >> On Wed, Sep 28, 2022 at 12:58:52PM +0100, Robin Murphy wrote: >>> As with its counterpart, replace copy_to_user() with a new and improved >>> version similarly derived from memcpy(). Different tradeoffs from the >>> base implementation are made relative to copy_from_user() to get the >>> most consistent results across different microarchitectures, but the >>> overall shape of the performance gain ends up about the same. >>> >>> The exception fixups are even more comical this time around, but that's >>> down to now needing to reconstruct the faulting address, and cope with >>> overlapping stores at various points. Again, improvements to the >>> exception mechanism could significantly simplify things in future. >>> >>> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >>> --- >>> arch/arm64/lib/copy_to_user.S | 391 +++++++++++++++++++++++++++++----- >>> 1 file changed, 341 insertions(+), 50 deletions(-) >> >> Mark reported (off-list) a KASAN splat from Syzkaller on for-next/core: >> >> | BUG: KASAN: stack-out-of-bounds in _copy_to_iter+0x1084/0x131c > > FWIW, I was testing with the config framents in: > > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=testing/arm64-fnc-20221202 > > ... building with: > > usekorg 12.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- defconfig testing.config syzkaller.config kasan.config > > ... and the full splat was: > > | ================================================================== > | BUG: KASAN: stack-out-of-bounds in _copy_to_iter+0x1084/0x131c > | Read of size 8 at addr ffff800017f27cc8 by task syz-executor.1/391 > | > | CPU: 1 PID: 391 Comm: syz-executor.1 Not tainted 6.1.0-rc4-00108-gc7586d0e4d65 #1 > | Hardware name: linux,dummy-virt (DT) > | Call trace: > | dump_backtrace.part.0+0x1c8/0x1d4 > | show_stack+0x34/0x5c > | dump_stack_lvl+0x11c/0x18c > | print_report+0x194/0x480 > | kasan_report+0xb8/0x250 > | __asan_report_load8_noabort+0x28/0x3c > | _copy_to_iter+0x1084/0x131c > | copy_page_to_iter+0x128/0x840 > | pipe_to_user+0xb8/0x160 > | __splice_from_pipe+0x358/0x744 > | __do_sys_vmsplice+0x3c8/0x8a0 > | __arm64_sys_vmsplice+0x94/0xe0 > | invoke_syscall+0x8c/0x2d0 > | el0_svc_common.constprop.0+0xf4/0x300 > | do_el0_svc+0x6c/0x1d4 > | el0_svc+0x54/0x120 > | el0t_64_sync_handler+0xf4/0x120 > | el0t_64_sync+0x190/0x194 > | > | The buggy address belongs to stack of task syz-executor.1/391 > | and is located at offset 488 in frame: > | __do_sys_vmsplice+0x0/0x8a0 > | > | This frame has 7 objects: > | [32, 40) 'iov' > | [64, 72) 'start' > | [96, 136) 'iter' > | [176, 216) 'buf' > | [256, 312) 'sd' > | [352, 480) 'iovstack' > | [512, 640) 'pages' > | > | The buggy address belongs to the virtual mapping at > | [ffff800017f20000, ffff800017f29000) created by: > | kernel_clone+0x1a4/0xb40 > | > | The buggy address belongs to the physical page: > | page:000000009402e755 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4a38c > | memcg:ffff000017b86e02 > | flags: 0x3fffc0000000000(node=0|zone=0|lastcpupid=0xffff) > | raw: 03fffc0000000000 0000000000000000 dead000000000122 0000000000000000 > | raw: 0000000000000000 0000000000000000 00000001ffffffff ffff000017b86e02 > | page dumped because: kasan: bad access detected > | > | Memory state around the buggy address: > | ffff800017f27b80: f2 f2 00 00 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 > | ffff800017f27c00: 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 > | >ffff800017f27c80: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00 > | ^ > | ffff800017f27d00: 00 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 > | ffff800017f27d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > | ================================================================== > > The reduced reproducer for this was: > > | r0 = syz_io_uring_setup(0x6bc0, &(0x7f0000000000), &(0x7f0000ffe000/0x1000)=nil, &(0x7f0000002000/0x2000)=nil, &(0x7f0000000080), &(0x7f0000000100)) > | mmap$IORING_OFF_SQES(&(0x7f0000002000/0x4000)=nil, 0x4000, 0x0, 0x12, r0, 0x10000000) > | pipe2$9p(&(0x7f0000000000)={<r1=>0xffffffffffffffff, <r2=>0xffffffffffffffff}, 0x0) > | vmsplice(r2, &(0x7f0000001540)=[{&(0x7f0000000040)="94", 0x1}, {&(0x7f0000000180)="a40cfbddd20ba9161c5596b9f8ab0268dd4bb5a16b47f86c2fcf444faa265e8d73e21b48001c57589092db934915ad208fdcdea67bac381f76ee3bcf6f25f36662e56e905acf80567bcdf60a8a4262805c26076857dc92113c0253ada9ffa5509aaa2f2733eeaab09b4cb28bff530a25c038e919a00803c705c97734473169f0dbb85e818c6a87256c7403480c44418cdfbe15f0d0747b01fd1f311951340b4ee16fdd3b3e5375a766b4240b657a0d1858f7e6c41d632d9256acfff12153f8d9dc90a3fc81f41b230b8d5516d9b855a67154e7ba9eee191152e7ab3fbced27ddf10a58fc8ff1c0652045ef3aeaf1ae8a8354abd6bcf9bee9a0f56cc1bd9c1f9050bb10e14882f12f66a74545033d9ca721f40d1de0ad9ad326b8432ac779e0fe24a8f7ce68d6c3d268d9d430dd57fcb9e1d76747d077ffc0b2157acb0d0d37ee609eb50a20e49586cad91693210ce24cea985abce4138ddf10c5334b1f353682eeceb863c5df1ca0665330030c30971b0f1d312199cb419afa33a36b1043afcc3319f632ab0ed56863f76d2eaec351b5c43b3b9140d430689bd226ed7d9cbdae94a7fbc37e4f57626701e322aaf110b1eb69d4bfdeb20de40dd031efd77a9890d51be497ff7e358905a22ebe71b0fc0566c08f32d3b9468266e6a58729b7f62c83e55716e8e927723bd8cb3fb70c90fb5dc8812818967bfe8fbbb058ad58ed21ce50fe1a55039f8d123c086ca1f78a56981b28be8a2484e52bc3d5a3b773997484076d04c646b57651cf67a4a1cd6aaacae25d926c6bb4c073009baf1e6c6f1dc4fc6b59c43ed99025e3feb446999e2a4817cbed6543e117ca1f4fb43703b78759e5762cc78d2aa935a67f2bf6229e13cda47023aabd4c4b6faba3db2a6abe0bfd2a65323df124d22dd6c7ed80bd643fa912c071dfe42e4c579fc7dd53a229b39a32cb9560c12bac3691ddb02fdf1d1ad49b1052a011619464bbae8c215041aa5b1a9007ae8863a931c05b3043e1a058c8be34849f02453a527dd775efdba1d7bdc4411c9c5adfb2f7692fb2cb6a43d5eb99d6278b64ea24916df21c73c35a726a3b03bf7c8e3de2b3b3c615d3ba714619b5151b7e8396f639112e2ac09ad167e63b18aab3a5478f7281d9ae5b0c1a0ac86cc20bf3df57075dfd0f2491c9ab5cb422dfed832dd0d406623a85a1ccf7f37270d228538328798591a5bc516629eee659f8146f577347bf0f6453f1d3aad97b660357da6160506278d146a5d720c2bcc572d9fa6e90b52444db803226249fe2657235aa3db2d7e7a8a1d5fbb7acc4dd5e1de056746af8256d32e8e936ee134cf5efb0a9eea03b7269ddfde666620be98dffd632bcb1c7e795bebabdb96b50c78d3c29d3ef235ddcd52b6c73f2aed472bff1e093756ba832c0ab853841906613f95682f0cdc37a851d57ac4f8918d77de65d5577e2c65225748f9d0e837445db3ffae17289724ab3e4a39e906b4b4661fd4c1599b30c9b9967d09fdbbc9f3727766eeb63f927ae751808026652399df01f82abf51a5293e0283cc63022c40824354bf884ff2b608fb6bbe8de8feec644708a282bf2103ab68efe46b3dcbcc20a0e178238f33c9e0a94b31cc68a472ee1e86bc99d90d8b220c91d72130f81eba41a58de44af2f03b5c0f4f9eda244e99b4ebf377483cdd7bae65de86e99e2f5dca94b8224388c3517752269eba1bd47f9c322b63dd497d6e1f195d31f857930af59b1aa937f0821728b70e8257bc4697b2dc07a1bf7c15034a4c2ab29d83257e63331d0c4c1c1c248f99dfe2862f42b99df8fb6431b7af90b62ce878c6ca3ed738d39fd66d8762b0dd401b1ed502dd0606b46a86ad532a40a0e14e67f30c413049682ec18deee5f1871ddbf7a7d945dfc69930a4642ac1e18a5cad46daa272436c6460d0fe03ee301d345af898a952f4328f5c786c1e1df1456cc8ea350d5077b5a8c08abc1e9494ce4b8bee08543a3fe22b9cb278082ae20e4430f28e0950f9aa3f45ea9f59735b02fedae3bbf6b360e8408dc0b15843666f04695c6e596b7596e7712ba9cfd53695fa2ae4f9887c84ed4bdbbd9977f4c457cc8133c957b38aefd1afd189e24fa4fb2373aab404736f62133ab1923342d654e0ca30f491ba7ceee626ab7cdcfa9f8d679bd63d69d0fb6031488442fb9e7307ae1873516764120888f38a4e456ec63e85fb5edef778d7a73636cba18a05ec1cc8fbe6b58a1f55d9d9519988e330f79357b88d92e0437d3f57715618387201fc4f716000be3a85652ec83d70abd313c6f51195d8582a40b941bbef0d632645890705433fdd0706e3d0a6850f90c94c3b7f45dfaee963705c86f25bcb85d82b11c84937b7fbd6b196c0e1dc44f2ef8c728e7aa61627e852e1d4fe978fb89785eecf51dd100c7906c597c8cdb377514ed21d82524b4acdb27e497a205cad6b4e267a6568b0b5a92d76657f5ae16368ae1f8988e9e84c45c2119c6b88df05044c36c254b4a2d2cec514e00785b4827eaed5947f9465433b6f68e26d7ee85a5898f56bfe75745cb79b9028638c0d165d15c52e3cc28f25d7004423fcc0c3cba38ddf8bc7f3a671f27e72b1df7ebc23912e931b5b53a2fc21c221e17b1115ade67b33ce1a4600b85eb6a04c63cd72346d01abcc31f79f5367d87e509309f5204c9940bf7de850173f632459182966601403959425da461cd31dcb5e6743d829d17a3d79d748dbe019bcd88b48804aa1776f48fdcff110fdcb9d5ad765560706e469d91fc55558887b54b752b5ec9b7de8a4ba49c7fa7662f0ab86220024ebd168a6ab9af7c75fabc2835b3ed4364ee1422999a2687dbceae58b490317f7531c5566f2408c16f33935f0015776fae25260074b01b610ab5643a70592a0c8ddf7e8ddb33ef0ddfdc0b8d256da0e173ecd90fe932040e80de0d194d76bd60cf401eaa4e25662062a32b0c6d1f40576a313609ea153268c0819d15481ad7361b20f09e61f6d5c5ff55d7900b8588cf7fe003d73d2cfa447afd1b3cef7e8d0b6b910b764afb43f835ef365705161eae14f63e67b2f3c0e4cdfd8eee6a95b14c9a37c4c20dc818adb645562bce72cf7f027091174d1e6076a292f5d491459d160b8ccb363aeba6dc0af4b8330cc1304e5f1da2aab6efab5adad72d924967331b35fc1c371bbbf8cf7d78d79e48c2ca71fb5a52d9415f17a96a0b31c2f0a5d9a9fbc171c63f1d2ea106993ac8c4c3aef6cea4298c7e57cf6923edb9c27c4ffff06a389e85ed69ceec61d146d093ede6064303f36baeb45813c28bb73371bc10a5ab4aa0bb7790898f19da71e580a56c6b7605caffa3e6b404a427004c2533d7cd48bb76fa7610607d95ec6eef644992337f420bddf8a9ef1b19320bce1f325c7655246a932dfc14518a4677d43e75b64edcd16267c82ec86ceb935f3b9a44e22aebcf2e1b5f6e73fb4cec247519878d9d0fd34aaef4c27d52dd78627849e623e0d9e1d745532ec3c182287d60d957817539515ce6f3fd0a1fd241713f1816f2d0a063e636b5adcd315b4ece5fd8dbe9ca81373d55915e199c304ba9b4532f3708f49971f6b1dfd4a4686ba6ff4aa52ddd54123765551b1c3fb1e3520f753b180e673cebc71e7bac24f2aba30538a8aec8a849c2e92a89e9755391dcfe33b4ef34c3598d5e255e30cd0d7fb3ede65fff7594c06d39569d17655ec7332676b124e29d3c5b3a24aba82338388601629da385f36fbceaaa91716b32f926df488f552012fecfe610e7b7cff725bf313bbf13cf622e3354c5320a95309760c5", 0xa80}], 0x2, 0x0) > | vmsplice(r1, &(0x7f0000001280)=[{&(0x7f0000001580)="29c7dedc1fc83e848a08d1ab6a6b70a8aba7434e6ad302c1f808d3e154d9a375904104ac349cb5edcd78b744ba6b27cc8f0ad38dda013828b1dcc8ef23a0b0cc865e9a527c9a0bf17f17e433f3700591c5b6a68a430559421b3a960dfb3d8e1a6c3b4144fc1820f4c11f013aaf7591983196aaed77577d52f8423632ce590d28c9f45cb91c745f407ac8af5e75892e589c1b5c73bc0f4646400545073a34214c4d5791711951fc2212cfcabd101ec69f54209c55df4ce1f258a84f979a13d2b01af941b372c111e518022ac74083fc15c3d649dfc3abff6af7d8dbb3e3a9702e3e6a175578176b02c1fbaef7060642f498626feb394e0fa0e6cca23849726911d16c6fb0507172611edd52f28abf5ba517dff8fbae13e8f5356e6a3fe6df0294180dfa82f4c0f8f3c0f297df774e986a48d5d83c6d669722c8e2516220f5f09189459282ab34a733acb7683f7f108bc5997190d527b56efb47e6a46cb3960ba626d3435d817a1ea26959b8a9cb52465dab60efa5d2e81fa8cc07fbbc9550b86163c9d603ae03e7000c50fbb0ce4c0481008892db89dc249bfb15929533f240f1d1a0c8a88e1b836e8ecf9510d10e98f2ef740ce61c885cf00a7aec6f8eee033212cb0d0c7a78a100745453c008502c08e9dfea6456f977956e208102c8ef93a275e4a1a41cbbd964d8f42ff81c729c5742cfc37d33965b07f1b2157e5aa456bb259dcaf9baa775c6d9474e7099eb811db6edefadb4b7a264d4b5c125fe514b710efe9c32ce8ec566e3d643e7771001dc64b72d44e8607032dc2b832a25b0a74db91bf2405fb69bd32fb7ca14ab52570a3074ec20b42ffc71e5bfb8a16582e2b48d784eb09d653af90419920ed7233eb64e6a22ad31833b17ab3d3b0d826ec7d69faf0c234187d6ec20844bb627bc1c476c3756b104e62c99dcd53f1e507f188be09ada0f7595cea3b3df488506398406a7782a0cade85e9ea925dc904d6028052e31b908ff3c7c1d0f0bb49be4208634c8bdb88463f7cb9942ab8621191679f5fc459906d9b63af25e1e7ef68bf7472fbed1585eb3620219691dfc692c3dbfe9aed440dccf2fb96dd7bb6f269d2043abdcc2f3aa469b9ace6e29dfe8846b9b9c17f7a0616fc59cbb66ff7c01a9fa670f4858de60decef9bd3fbf892691e73e40477b0ca55c5463322d8b228a3168a540b029072c8e6f56819fd2ef1486156429769f253d9126901b4e37b24261c127f9948674fcc3855b6f676e5979edd9b857c348ec176d2977fc09431cb35b1ad80fac77ec7a89d9aa9ecf72e66bab174da68cfeb902e5ac1ad8504514bf2fdc45b1fc716f02c4e490168a574bfb1b11dfabd9ead6ab86cc8bc8f861cea8e294bed3651dd78bf11ee548250bc749eb9d81e9fe9fc066884af8d7ea3e5b556e1e98797e8e2022e3fd523508f0b3b2e9cce819cf8779e77937d191e9ff2ece77c530af5ebbba5d642b1fc3e8c32bfe14a1535676e9ce6d23e89b0f22c2e9e739b073857c015cf75be65a0bdbe70145f13c9cdcf4eafcfb1bb733d1b26eb7a5992e3f810f0af8403bb75b55e4430a7d2e8724cbdd7a22c1d4043530c9ee4388ff08331a16664cd9f4348a64d10f5db094ac3336f830b8428b88ec3e46ab56d089efb61c0ef08dd478de5b54288c6d42640367922a9eb84c3024adda569a8792be03217d1b791f6d31e21dff9808e779411190d2294b210d67ffde952f1c65231f5ac82d15fc98072fc1e43605346e5116aa6b5d8372411913c80b06cc71c1ee0608e4ed215550c44db4e3d5ac295cacd486a97813589d96b7b1a105155176baeb6ed29092fa9327d0d7d67ab6c58f7c2f9fca5fe117c3b999e1492fcf9d5e101e9ad61c132e6e3165ee81ddbaa9e403feb2a66fca150735ad95d9b9295ade8c3c396c5bad4d48b05e1ed09cefafd7e3e35818ee5550e7c76ff52432105469a79e7133008dd025695c6ed2a459a8d6c7f49ba44a12b7c7bb9e91eff1634d1106abfbbc7c8fd690961184462bfdcfd1e8b8c730fd971b95e929db1089e04b179eaa094318bc31cb46bb86a64dac45ae0da7222c981d40b6b193c2fd6b374a4b483a690a25589e48e493c26f4604899653fcd7b18b7601140631a7beed7921e8ca6befb917d56fedcbf3be67da78b3dd6c0653af71ecda583570678c331ac0e98c881938a6e9835a627b48c72b5c604c92cbcb19ac7b68a619f45b9fcedc846e7210ed73307cb8a3b5d87f2a6f3a97e74aad9f84d5bbd395f67c5dc56b1b16ff5221962b24feeb38ea32ffcbe1aadc0cc85825533ce22734c2e72fafb99b1d7d4fa9140877a90fae4cf38fb90cf63c527e49f1c8a9ea5b3e4851f77df566439010776820214b6fae22a2f99ccbd86593034348d7b0668bd2ac4d4456f89ab5756a17f8484008905bd1a08430f5cdeac84525ced42092def29f4bf1b7703befceae16da3d43ff6a19aae61d27ce7b94fbdff56da0035511e0d36d7f01b5f15b5a1f42b3f4c07c03d68ce58dfce67ae6634ccdd382e4b68dbc1df0a098bd75ff459b7028d1caeb0c478a178e67253c013a2b9d19ec62f10b90c26259def748c2836cec94f92fa81470d5dbd891a5034bd5841c7b4af7a3bed1b67de385f7e0fb85ed53928da52a0b98e2ab5f32719578c3d4d8ca0f3f7d9c82ef9cd405e0322097f8372bb6e537754f0599bdaa657f2b9aadef7373e98e6dc9482e3dbe88aa554c70191f0ddcd8588da337d2d4389e4a468ec21ed7eb3653a45cb87cb3c626b7f1d03cb0a385fdaf3b30ca1ee145a7729958c330e69e91f058cac7c4b74149369266b758cf62f1b8474e0e1d96101812b6c24abb244ffac625b83cd50e8b1c8d9ba7dbb1e92fd9e6a8f158b3abff84c24413e772b18cdb12ecff68f516dc3191840e5611eb3e1581d477a496ee4403230decbe9cc1ef0aa1beda4c15fa9f70874a7494a83e450039ef2385d19fe381de1580449e206f31bd1ba17d99c60bc6eb164174d69320679089c474e4befdaabd688c4102047a808340094a38934d327f044ced1070f9c3d7fffe3863ff61417fb263c351d7324ddd8e4361398c87eaed9eab9a63a4f0a23122a1c758acf81f91a0f9109df4c8af6d4f0900fc3ef04f9e9c6825ee42d6d76e3ed75b845e5da333804af68fed85ad3271235f7ad87d6325c1426dcca790ae3f3c4c5980d1ae7d9cb5dc577f02babe2f0198468c12fd54a9222e2a6bdc70bea9e9c8b1d9a1e7f0b090dfefb36250bbac93f636aeac45b4f92d5b0b2fb3b91d36e48c8499ebe4c7f9d7aadc2a327e1c08201f695b072933d1d223c7fbe0c059244ca4f874e475336c6740e235ab9457e00c5da38686903edbfd5834385e9c730749002e798040f12a37e415f6ba49ad23ae79650ca233fcef58d8b412cdef87c70d5897b25dcf4f6caaa266da726104549a6158165b4724c5ec9ef7bc807f46beb2691cf3de90d2bf72ba26e55a47ae74e8a0854c4f8cf4a57ef38b9e431fd34850fa69336ed46912727532352ebf73595edaab6065917d3a2bb775ac7424fafd380be00adf58a8fdd2b0b6e1b536fd77876a8a18097b03ca0ccefb655bda134bd2386b1bc4b1489a38e54d025612c36305a245649fc89fe281cd8f93395f00f6dc5e1417f353b84d52f08cf2ce44a97601ab513cdb351a5a608e03ac8c233764572e3c006c203cea941cdeec4bda877edc887a07ff0003cf1342f1cc64857bffd264104f850a5382efe5cbe5c439af6007b2f64555b1d276388ef50134d0ca819b52105351a5570adc17757601d890", 0xa81}], 0x1, 0x0) > > ... which blew up atop for-next/uaccess, but was fine atop v6.1-rc4. > > A bisect ended with: > > c1dd028491d18d88d4ece4e5ad8473440a68a6b1 is the first bad commit > > i.e. > > c1dd028491d18d88 ("arm64: Update copy_to_user()") > > ... which seesm to agree with Will's analysis below. > >> We've not had any reports of this before (or on other branches) and Mark >> is unable to reproduce the problem on mainline. I suggested trying >> for-next/uaccess and he confirmed that the problem has been introduced >> there. Hopefully he can chime in with more details. >> >> In the meantime, I think it's safest to drop this branch for now considering >> where we are in the release cycle. I'll leave it pushed to the arm64 git, >> so we can put fixes on top and try to merge it next time around instead. >> >> One thing I _did_ notice from a quick skim of the code is: >> >>> + /* Fixups... */ >>> + >>> + /* >>> + * Fault on the first write, but progress may have been possible; >>> + * realign dst and retry a single byte to confirm. >>> + */ >>> +L(fixup_pre): >>> + mov dst, dstin >>> +U_dst( ldtrb A_lw, [src]) >>> + strb A_lw, [dst], #1 >>> +L(fixup_dst): >>> + sub x0, dstend, dst >>> + ret >> >> Why is U_dst() being applied to the load for a copy_to_user() here? I >> would've expected to annotate the store. > > Ignoring the annotation, that's an LDTRB (an *unprivileged* load) feeding into > an STRB (a *privileged* store), which is definitely the wrote way around for > copy_*to*_user(). > > I think this is a copy-paste mixup with the code in copy_from_user(), since the > two are identical here: Indeed, I think one of the very last things I did was to rename and reorder a load of the fixup labels for consistency, and it does appear that I inadvertently made things a bit *too* consistent here :( [ FWIW "fixup_dst" refers to the address in the dst register being correctly adjusted for the final return calculation, so "U_dst()" just marks a fault which doesn't need any pre-processing. Prior to that I was starting to confuse myself whether fixup_0 meant 0 offset or 0 bytes copied... ] Sadly I don't see how this little mishap alone could explain the KASAN report, though - seems like that's going to need more digging, so given the timescale I'm happy if you'd rather play it safe this time around. Thanks, Robin. > > # copy_from_user() > | /* > | * Fault before anything has been written, but progress may have > | * been possible; realign dst and retry a single byte to confirm. > | */ > | L(fixup_pre): > | mov dst, dstin > | U_dst( ldtrb A_lw, [src]) > | strb A_lw, [dst], #1 > | L(fixup_dst): > | sub x0, dstend, dst > | ret > > # copy_to_user() > | /* > | * Fault on the first write, but progress may have been possible; > | * realign dst and retry a single byte to confirm. > | */ > | L(fixup_pre): > | mov dst, dstin > | U_dst( ldtrb A_lw, [src]) > | strb A_lw, [dst], #1 > | L(fixup_dst): > | sub x0, dstend, dst > | ret > > Thanks, > Mark.
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S index 802231772608..b641f00f50d6 100644 --- a/arch/arm64/lib/copy_to_user.S +++ b/arch/arm64/lib/copy_to_user.S @@ -1,73 +1,364 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * Copyright (C) 2012 ARM Ltd. + * Copyright (c) 2012-2022, Arm Limited. */ #include <linux/linkage.h> #include <asm/asm-uaccess.h> #include <asm/assembler.h> -#include <asm/cache.h> + +/* Assumptions: + * + * ARMv8-a, AArch64, unaligned accesses. + * + */ + +#define L(label) .L ## label + +#define dstin x0 +#define src x1 +#define count x2 +#define dst x3 +#define srcend x4 +#define dstend x5 +#define A_l x6 +#define A_lw w6 +#define A_h x7 +#define B_l x8 +#define B_lw w8 +#define B_h x9 +#define C_l x10 +#define C_lw w10 +#define C_h x11 +#define D_l x12 +#define D_h x13 +#define E_l x14 +#define E_h x15 +#define F_l x16 +#define F_h x17 +#define tmp1 x14 /* - * Copy to user space from a kernel buffer (alignment handled by the hardware) + * Derived from memcpy with various adjustments: * - * Parameters: - * x0 - to - * x1 - from - * x2 - n - * Returns: - * x0 - bytes not copied + * - memmove parts are removed since user and kernel pointers won't overlap. + * - Contrary to copy_from_user, although big cores still aren't particularly + * keen on the increased instruction count in the main loop, processing fewer + * than 64 bytes per iteration here hurts little cores more. + * - The medium-size cases are reworked to better balance the loads with the + * doubled-up store ops, avoid potential out-of-sequence faults, and preserve + * the input arguments for the sake of fault handling. + * - The 0-3 byte sequence is replaced with the one borrowed from clear_user, + * since STTRB lacks a register-offset addressing mode. */ - .macro ldrb1 reg, ptr, val - ldrb \reg, [\ptr], \val - .endm - .macro strb1 reg, ptr, val - user_ldst 9998f, sttrb, \reg, \ptr, \val - .endm +#define U_pre(x...) USER(L(fixup_pre), x) +#define U_dst(x...) USER(L(fixup_dst), x) - .macro ldrh1 reg, ptr, val - ldrh \reg, [\ptr], \val - .endm +#define U_S1(x...) USER(L(fixup_s1), x) +#define U_S4(x...) USER(L(fixup_s4), x) +#define U_S8(x...) USER(L(fixup_s8), x) +#define U_ST8(x...) USER(L(fixup_st8), x) +#define U_S16(x...) USER(L(fixup_s16), x) +#define U_M24(x...) USER(L(fixup_m24), x) +#define U_M32(x...) USER(L(fixup_m32), x) +#define U_M40(x...) USER(L(fixup_m40), x) +#define U_M48(x...) USER(L(fixup_m48), x) +#define U_M56(x...) USER(L(fixup_m56), x) +#define U_M64(x...) USER(L(fixup_m64), x) +#define U_MT8(x...) USER(L(fixup_mt8), x) +#define U_MT16(x...) USER(L(fixup_mt16), x) +#define U_MT24(x...) USER(L(fixup_mt24), x) +#define U_MT32(x...) USER(L(fixup_mt32), x) +#define U_MT40(x...) USER(L(fixup_mt40), x) +#define U_MT48(x...) USER(L(fixup_mt48), x) +#define U_MT56(x...) USER(L(fixup_mt56), x) - .macro strh1 reg, ptr, val - user_ldst 9997f, sttrh, \reg, \ptr, \val - .endm +#define U_L16(x...) USER(L(fixup_l16), x) +#define U_L24(x...) USER(L(fixup_l24), x) +#define U_L32(x...) USER(L(fixup_l32), x) +#define U_L40(x...) USER(L(fixup_l40), x) +#define U_L48(x...) USER(L(fixup_l48), x) +#define U_L56(x...) USER(L(fixup_l56), x) +#define U_L64(x...) USER(L(fixup_l64), x) +#define U_L72(x...) USER(L(fixup_l72), x) +#define U_LT8(x...) USER(L(fixup_lt8), x) +#define U_LT16(x...) USER(L(fixup_lt16), x) +#define U_LT24(x...) USER(L(fixup_lt24), x) +#define U_LT32(x...) USER(L(fixup_lt32), x) +#define U_LT40(x...) USER(L(fixup_lt40), x) +#define U_LT48(x...) USER(L(fixup_lt48), x) +#define U_LT56(x...) USER(L(fixup_lt56), x) +#define U_LT64(x...) USER(L(fixup_lt64), x) - .macro ldr1 reg, ptr, val - ldr \reg, [\ptr], \val - .endm - - .macro str1 reg, ptr, val - user_ldst 9997f, sttr, \reg, \ptr, \val - .endm - - .macro ldp1 reg1, reg2, ptr, val - ldp \reg1, \reg2, [\ptr], \val - .endm - - .macro stp1 reg1, reg2, ptr, val - user_stp 9997f, \reg1, \reg2, \ptr, \val - .endm - -end .req x5 -srcin .req x15 SYM_FUNC_START(__arch_copy_to_user) - add end, x0, x2 - mov srcin, x1 -#include "copy_template.S" + add srcend, src, count + add dstend, dstin, count + cmp count, 128 + b.hi L(copy_long) + cmp count, 32 + b.hi L(copy32_128) + + /* Small copies: 0..32 bytes. */ + cmp count, 16 + b.lo L(copy16) + ldp A_l, A_h, [src] + ldp D_l, D_h, [srcend, -16] +U_pre( sttr A_l, [dstin]) +U_S8( sttr A_h, [dstin, 8]) +U_S16( sttr D_l, [dstend, -16]) +U_ST8( sttr D_h, [dstend, -8]) mov x0, #0 ret - // Exception fixups -9997: cmp dst, dstin - b.ne 9998f - // Before being absolutely sure we couldn't copy anything, try harder - ldrb tmp1w, [srcin] -USER(9998f, sttrb tmp1w, [dst]) - add dst, dst, #1 -9998: sub x0, end, dst // bytes not copied + /* Copy 8-15 bytes. */ +L(copy16): + tbz count, 3, L(copy8) + ldr A_l, [src] + ldr A_h, [srcend, -8] +U_pre( sttr A_l, [dstin]) +U_S8( sttr A_h, [dstend, -8]) + mov x0, #0 ret + + .p2align 3 + /* Copy 4-7 bytes. */ +L(copy8): + tbz count, 2, L(copy4) + ldr A_lw, [src] + ldr B_lw, [srcend, -4] +U_pre( sttr A_lw, [dstin]) +U_S4( sttr B_lw, [dstend, -4]) + mov x0, #0 + ret + + /* Copy 0..3 bytes. */ +L(copy4): + tbz count, #1, L(copy1) + ldrh A_lw, [src] +U_pre( sttrh A_lw, [dstin]) +L(copy1): + tbz count, #0, L(copy0) + ldrb A_lw, [srcend, -1] +U_S1( sttrb A_lw, [dstend, -1]) +L(copy0): + mov x0, #0 + ret + + .p2align 4 + /* Medium copies: 33..128 bytes. */ +L(copy32_128): + ldp A_l, A_h, [src] + ldp B_l, B_h, [src, 16] +U_pre( sttr A_l, [dstin]) +U_S8( sttr A_h, [dstin, 8]) +U_S16( sttr B_l, [dstin, 16]) +U_M24( sttr B_h, [dstin, 24]) + ldp C_l, C_h, [srcend, -32] + ldp D_l, D_h, [srcend, -16] + cmp count, 64 + b.ls L(copy64) + ldp E_l, E_h, [src, 32] + ldp F_l, F_h, [src, 48] +U_M32( sttr E_l, [dstin, 32]) +U_M40( sttr E_h, [dstin, 40]) +U_M48( sttr F_l, [dstin, 48]) +U_M56( sttr F_h, [dstin, 56]) + ldp A_l, A_h, [srcend, -64] + ldp B_l, B_h, [srcend, -48] +U_M64( sttr A_l, [dstend, -64]) +U_MT56( sttr A_h, [dstend, -56]) +U_MT48( sttr B_l, [dstend, -48]) +U_MT40( sttr B_h, [dstend, -40]) +L(copy64): +U_MT32( sttr C_l, [dstend, -32]) +U_MT24( sttr C_h, [dstend, -24]) +U_MT16( sttr D_l, [dstend, -16]) +U_MT8( sttr D_h, [dstend, -8]) + mov x0, #0 + ret + + .p2align 4 + nop + nop + nop + /* Copy more than 128 bytes. */ +L(copy_long): + /* Copy 16 bytes and then align dst to 16-byte alignment. */ + + ldp D_l, D_h, [src] + and tmp1, dstin, 15 + bic dst, dstin, 15 + sub src, src, tmp1 + add count, count, tmp1 /* Count is now 16 too large. */ + ldp A_l, A_h, [src, 16] +U_pre( sttr D_l, [dstin]) +U_S8( sttr D_h, [dstin, 8]) + ldp B_l, B_h, [src, 32] + ldp C_l, C_h, [src, 48] + ldp D_l, D_h, [src, 64]! + subs count, count, 128 + 16 /* Test and readjust count. */ + b.ls L(copy64_from_end) + +L(loop64): +U_L16( sttr A_l, [dst, 16]) +U_L24( sttr A_h, [dst, 24]) + ldp A_l, A_h, [src, 16] +U_L32( sttr B_l, [dst, 32]) +U_L40( sttr B_h, [dst, 40]) + ldp B_l, B_h, [src, 32] +U_L48( sttr C_l, [dst, 48]) +U_L56( sttr C_h, [dst, 56]) + ldp C_l, C_h, [src, 48] +U_L64( sttr D_l, [dst, 64]) +U_L72( sttr D_h, [dst, 72]) + add dst, dst, #64 + ldp D_l, D_h, [src, 64]! + subs count, count, 64 + b.hi L(loop64) + + /* Write the last iteration and copy 64 bytes from the end. */ +L(copy64_from_end): + ldp E_l, E_h, [srcend, -64] +U_L16( sttr A_l, [dst, 16]) +U_L24( sttr A_h, [dst, 24]) + ldp A_l, A_h, [srcend, -48] +U_L32( sttr B_l, [dst, 32]) +U_L40( sttr B_h, [dst, 40]) + ldp B_l, B_h, [srcend, -32] +U_L48( sttr C_l, [dst, 48]) +U_L56( sttr C_h, [dst, 56]) + ldp C_l, C_h, [srcend, -16] +U_L64( sttr D_l, [dst, 64]) +U_L72( sttr D_h, [dst, 72]) +U_LT64( sttr E_l, [dstend, -64]) +U_LT56( sttr E_h, [dstend, -56]) +U_LT48( sttr A_l, [dstend, -48]) +U_LT40( sttr A_h, [dstend, -40]) +U_LT32( sttr B_l, [dstend, -32]) +U_LT24( sttr B_h, [dstend, -24]) +U_LT16( sttr C_l, [dstend, -16]) +U_LT8( sttr C_h, [dstend, -8]) + mov x0, #0 + ret + + /* Fixups... */ + + /* + * Fault on the first write, but progress may have been possible; + * realign dst and retry a single byte to confirm. + */ +L(fixup_pre): + mov dst, dstin +U_dst( ldtrb A_lw, [src]) + strb A_lw, [dst], #1 +L(fixup_dst): + sub x0, dstend, dst + ret + + /* Small: Fault with 1 byte remaining, regardless of count */ +L(fixup_s1): + mov x0, #1 + ret + + /* Small tail case: Fault 8 bytes before dstend, >=16 bytes written */ +L(fixup_st8): + sub dst, dstend, #8 + add dstin, dstin, #16 +L(fixup_tail): + cmp dst, dstin + csel dst, dst, dstin, hi + b L(fixup_dst) + + /* Small/medium: Faults n bytes past dtsin, that much written */ +L(fixup_m64): + add dstin, dstin, #8 +L(fixup_m56): + add dstin, dstin, #8 +L(fixup_m48): + add dstin, dstin, #8 +L(fixup_m40): + add dstin, dstin, #8 +L(fixup_m32): + add dstin, dstin, #8 +L(fixup_m24): + add dstin, dstin, #8 +L(fixup_s16): + add dstin, dstin, #8 +L(fixup_s8): + add dstin, dstin, #4 +L(fixup_s4): + add dst, dstin, #4 + b L(fixup_dst) + + /* + * Medium tail cases: Faults n bytes before dstend, 64 or 32 bytes + * past dstin written, depending on original count + */ +L(fixup_mt56): + sub count, count, #8 +L(fixup_mt48): + sub count, count, #8 +L(fixup_mt40): + sub count, count, #8 +L(fixup_mt32): + sub count, count, #8 +L(fixup_mt24): + sub count, count, #8 +L(fixup_mt16): + sub count, count, #8 +L(fixup_mt8): + add count, count, #8 + add dst, dstin, count + + sub tmp1, dstend, dstin + cmp tmp1, #64 + add tmp1, dstin, #64 + add dstin, dstin, #32 + csel dstin, dstin, tmp1, ls + b L(fixup_tail) + + /* Large: Faults n bytes past dst, at least 16 bytes past dstin written */ +L(fixup_l72): + add dst, dst, #8 +L(fixup_l64): + add dst, dst, #8 +L(fixup_l56): + add dst, dst, #8 +L(fixup_l48): + add dst, dst, #8 +L(fixup_l40): + add dst, dst, #8 +L(fixup_l32): + add dst, dst, #8 +L(fixup_l24): + add dst, dst, #8 +L(fixup_l16): + add dst, dst, #16 + add dstin, dstin, #16 + b L(fixup_tail) + + /* Large tail: Faults n bytes before dstend, 80 bytes past dst written */ +L(fixup_lt64): + sub count, count, #8 +L(fixup_lt56): + sub count, count, #8 +L(fixup_lt48): + sub count, count, #8 +L(fixup_lt40): + sub count, count, #8 +L(fixup_lt32): + sub count, count, #8 +L(fixup_lt24): + sub count, count, #8 +L(fixup_lt16): + sub count, count, #8 +L(fixup_lt8): + add count, count, #56 /* Count was also off by 64 */ + add dstin, dst, #80 + add dst, dst, count + b L(fixup_tail) + SYM_FUNC_END(__arch_copy_to_user) EXPORT_SYMBOL(__arch_copy_to_user)
As with its counterpart, replace copy_to_user() with a new and improved version similarly derived from memcpy(). Different tradeoffs from the base implementation are made relative to copy_from_user() to get the most consistent results across different microarchitectures, but the overall shape of the performance gain ends up about the same. The exception fixups are even more comical this time around, but that's down to now needing to reconstruct the faulting address, and cope with overlapping stores at various points. Again, improvements to the exception mechanism could significantly simplify things in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- arch/arm64/lib/copy_to_user.S | 391 +++++++++++++++++++++++++++++----- 1 file changed, 341 insertions(+), 50 deletions(-)