Message ID | Y07SYs98z5VNxdZq@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [git,pull] device mapper changes for 6.1 | expand |
On Tue, Oct 18, 2022 at 12:20:50PM -0400, Mike Snitzer wrote: > > - Enhance DM ioctl interface to allow returning an error string to > userspace. Depends on exporting is_vmalloc_or_module_addr() to allow > DM core to conditionally free memory allocated with kasprintf(). That really does not sound like a good idea at all. And it does not seem to have any MM or core maintainer signoffs.
On Tue, Oct 18 2022 at 2:17P -0400, Christoph Hellwig <hch@infradead.org> wrote: > On Tue, Oct 18, 2022 at 12:20:50PM -0400, Mike Snitzer wrote: > > > > - Enhance DM ioctl interface to allow returning an error string to > > userspace. Depends on exporting is_vmalloc_or_module_addr() to allow > > DM core to conditionally free memory allocated with kasprintf(). > > That really does not sound like a good idea at all. And it does not > seem to have any MM or core maintainer signoffs. Sorry, I should've solicited proper review more loudly. But if you have a specific concern with how DM is looking to use is_vmalloc_or_module_addr() please say what that is. Mike
On Tue, Oct 18, 2022 at 11:17 AM Christoph Hellwig <hch@infradead.org> wrote: > > On Tue, Oct 18, 2022 at 12:20:50PM -0400, Mike Snitzer wrote: > > > > - Enhance DM ioctl interface to allow returning an error string to > > userspace. Depends on exporting is_vmalloc_or_module_addr() to allow > > DM core to conditionally free memory allocated with kasprintf(). > > That really does not sound like a good idea at all. And it does not > seem to have any MM or core maintainer signoffs. I wouldn't worry about maintainer sign-offs just for exporting a helper function, but I agree with the whole concept being a complete disaster and not a good idea at all. Use errno. It really is that simple. Strings have been discussed before, and they are simply not a good idea. If your interface is so complicated that you think errors need some textual explanation, your interface is probably garbage. Strings also have allocation issues (as you found out), and have serious localization issues. Yes, we do a lot of strings in the kernel in the form of dmesg, and we have the rule that we simply don't localize. But that's dmesg. It's for special stuff, not some interface. And equally importantly, some really small detail in the kernel really has *NO* business making up new error models of its own. You may think that the DM ioctl's are a big and important deal, but realistically, it's just an odd corner of the world that very very few people care about, and they can use the same error numbers that EVERYBODY ELSE HAS BEEN USING FOR SIX DECADES! Don't reinvent something that works - badly. I think we have one major interface that is string-based (apart from the obvious pathname ones and the strings passed to 'execve()'). It's 'mount()' (and now fsconfig() etc), and it's string-based mainly because it has that nasty "arbitrary things that different filesystem may need for configuration"). And it has some nasty logging model associated with it too for output. But no, we absolutely do *not* want to emulate that particular horror anywhere else. If you think some errors are really important and hard to understand, maybe you can just log them with a ratelimited pr_info() or something. Linus
On Tue, Oct 18, 2022 at 11:54 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > But no, we absolutely do *not* want to emulate that particular horror > anywhere else. Side note: if DM people go "Hmm, a lot of our management really does have the exact same issues as 'mount()' and friends, and doing the same things as mount does with the whole string interface and logging sounds like a good idea", I want to nip that in the bud. It's most definitely *not* a good idea. The mount thing is nasty, it's just that we've always had the odd string interface, and it's just grown from "const void *data" to be a whole complicated set of context handling. So don't even think about duplicating that thing. Now, if some "inspired" person then thinks that instead of duplicating it, you really would want to do device mapper *as* a filesystem and actually using the fsconfig() model directly and natively, that is at least conceptually not necessarily wrong. At what point does a "translate disk blocks and munge contents" turn from "map devices into other devices" to a "map devices into a filesystem"? We've had loop devices forever, and they already show how filesystems and block devices can be a mixed concept. And no, I'm not seriously suggesting that as a "do this instead". I'm just saying that from an interface standpoint, we _do_ have an interface that is kind of doing this, and that is already an area where a lot of people think there is a lot of commonalities (ie a number of filesystems end up doing their own device mapping internally, and people used to say "layering violation - please use dm for that" before they got used/resigned to it because the filesystem people really wanted to control the mapping). In the absence of that kind of unification, just use 'errno'. Linus
On Tue, 18 Oct 2022, Linus Torvalds wrote: > On Tue, Oct 18, 2022 at 11:17 AM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Tue, Oct 18, 2022 at 12:20:50PM -0400, Mike Snitzer wrote: > > > > > > - Enhance DM ioctl interface to allow returning an error string to > > > userspace. Depends on exporting is_vmalloc_or_module_addr() to allow > > > DM core to conditionally free memory allocated with kasprintf(). > > > > That really does not sound like a good idea at all. And it does not > > seem to have any MM or core maintainer signoffs. > > I wouldn't worry about maintainer sign-offs just for exporting a > helper function, but I agree with the whole concept being a complete > disaster and not a good idea at all. > > Use errno. > > It really is that simple. Strings have been discussed before, and they > are simply not a good idea. If your interface is so complicated that > you think errors need some textual explanation, your interface is > probably garbage. > > Strings also have allocation issues (as you found out), and have > serious localization issues. > > Yes, we do a lot of strings in the kernel in the form of dmesg, and we > have the rule that we simply don't localize. But that's dmesg. It's > for special stuff, not some interface. > > And equally importantly, some really small detail in the kernel really > has *NO* business making up new error models of its own. You may think > that the DM ioctl's are a big and important deal, but realistically, > it's just an odd corner of the world that very very few people care > about, and they can use the same error numbers that EVERYBODY ELSE HAS > BEEN USING FOR SIX DECADES! > > Don't reinvent something that works - badly. > > I think we have one major interface that is string-based (apart from > the obvious pathname ones and the strings passed to 'execve()'). > > It's 'mount()' (and now fsconfig() etc), and it's string-based mainly > because it has that nasty "arbitrary things that different filesystem > may need for configuration"). And it has some nasty logging model > associated with it too for output. > > But no, we absolutely do *not* want to emulate that particular horror > anywhere else. > > If you think some errors are really important and hard to understand, > maybe you can just log them with a ratelimited pr_info() or something. This is what we currently do. > Linus The error string is not intended to be parsed by userspace (I agree that parsing the error string is a horrible idea, but this is not going to happen). It is intended to be displayed to the user by tools such as cryptsetup or integritysetup. The tool can't read the log, extract messages from it and display them. With "just use errno", the user sees messages like "device-mapper: reload ioctl on test (254:0) failed: No such file or directory" and it's not much useful because it doesn't tell what went wrong. Try to type "grep -r 'ti->error = ' drivers/md/|wc -l". There are 480 distinct error messages generated by device mapper. You can't map each of them to a unique errno number. BTW. we were talking about replacing device mapper version numbers with feature bitmaps and people preferred textual lists of features instead of bitmaps (because the bitmap will overflow when you have more than 64 features). Do you oppose to this too? Do you prefer a 64-bit feature bitmap or a string with comma-separated list of features? Mikulas
On Tue, Oct 18 2022 at 3:19P -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Tue, Oct 18, 2022 at 11:54 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > But no, we absolutely do *not* want to emulate that particular horror > > anywhere else. > > Side note: if DM people go "Hmm, a lot of our management really does > have the exact same issues as 'mount()' and friends, and doing the > same things as mount does with the whole string interface and logging > sounds like a good idea", I want to nip that in the bud. > > It's most definitely *not* a good idea. The mount thing is nasty, it's > just that we've always had the odd string interface, and it's just > grown from "const void *data" to be a whole complicated set of context > handling. > > So don't even think about duplicating that thing. > > Now, if some "inspired" person then thinks that instead of duplicating > it, you really would want to do device mapper *as* a filesystem and > actually using the fsconfig() model directly and natively, that is at > least conceptually not necessarily wrong. At what point does a > "translate disk blocks and munge contents" turn from "map devices into > other devices" to a "map devices into a filesystem"? We've had loop > devices forever, and they already show how filesystems and block > devices can be a mixed concept. > > And no, I'm not seriously suggesting that as a "do this instead". > > I'm just saying that from an interface standpoint, we _do_ have an > interface that is kind of doing this, and that is already an area > where a lot of people think there is a lot of commonalities (ie a > number of filesystems end up doing their own device mapping > internally, and people used to say "layering violation - please use dm > for that" before they got used/resigned to it because the filesystem > people really wanted to control the mapping). > > In the absence of that kind of unification, just use 'errno'. Mikulas touches on why why using errno isn't useful for DM. And for DM's device stacking it is hard to know which error spewed to dmesg (via DMERR, DMCRIT, DMINFO, etc) is relevant to a particular admin interface issuing the DM ioctl. So the idea to send the (hopefully useful) error string back up to the relevant userspace consumer was one task that seemed needed (based on Milan Broz's various complaints against DM.. which sprang from your regular remainder that DM's version numbers are broken and need to be removed/replaced). Making DM errors more useful to the endpoints causing them was dealt with head-on with a couple small changes in this pull request. I didn't think sending useful error strings to userspace was such a contested design point. All said, we'll have a look at dealing with your suggested fsconfig unification (but it seems really awkward on the surface, maybe we can at least distill out some subset that is common). Mike
On 10/18/22 23:13, Mike Snitzer wrote: ... >> In the absence of that kind of unification, just use 'errno'. > > Mikulas touches on why why using errno isn't useful for DM. And for > DM's device stacking it is hard to know which error spewed to dmesg > (via DMERR, DMCRIT, DMINFO, etc) is relevant to a particular admin > interface issuing the DM ioctl. > > So the idea to send the (hopefully useful) error string back up to the > relevant userspace consumer was one task that seemed needed (based on > Milan Broz's various complaints against DM.. which sprang from your > regular remainder that DM's version numbers are broken and need to be > removed/replaced). Well, when you mention my complaints, I think we are moving from one extreme to another. For the error reporting - we use errno values in libcryptsetup everywhere, so if DM ioctls (through libdevmapper we use) returns proper errno, this is the minimal solution that helps here. The problem is that ioctl() are often just translated to -EINVAL. (Or lost in libdevmapper compatibility layers.) From the dm-crypt/verity/integrity perspective, ENOMEM, ENODEV (bad block device), ENOTSUP/ENOENT (for crypto algs not available), EIO for IO error, EILSEQ for data integrity failure is the basic what we need. (I perhaps forgot some, I can go through the code if you need it.) As a bonus, if DM ioctl() returns fixed string that describes user-friendly error (like: "keysize not compatible" or so) that's all we need (ti->error strings are already in DM targets). I have never asked for dynamically allocated error strings in kernel and I do not know Mikulas' motivation to implement it. Milan
On Tue, Oct 18, 2022 at 1:29 PM Mikulas Patocka <mpatocka@redhat.com> wrote: > > The error string is not intended to be parsed by userspace (I agree that > parsing the error string is a horrible idea, but this is not going to > happen). I am happy we agree on that fundamental issue. But it's also why error strings are a HORRIBLE HORRIBLE idea. They are literally worse than just plain 'errno', exactly because user space MUST NOT parse them. They are worse because: - user space will parse them anyway, for localization reasons, so the whole design is garbage - user space that correctly doesn't parse them cannot use them for any helpful thing apart from just displaying them, which makes them actively worse than having a number that you *can* make actual decision on. In other words, user space either can violate the fundamental rule of "don't parse it", or they are actively weaker than then errno numbers we already have. Either way, they are worse. See? > It is intended to be displayed to the user by tools such as > cryptsetup or integritysetup. The tool can't read the log, extract > messages from it and display them. Bullshit. The tools could just use the error number and together with the operation that failed, make a very good assumption on what went wrong. And even when that assumption isn't some 100% "this is the reason", the tool can easily print out helpful hints, like "This is often because of Xyz". And guess what? With an errno, the tool can do this MUCH BETTER. It can localize the error messages. It can do different things for different error messages. And it will work with old kernels too. > With "just use errno", the user sees messages like "device-mapper: reload > ioctl on test (254:0) failed: No such file or directory" and it's not much > useful because it doesn't tell what went wrong. Again, I call bullshit. You are saying "the tools isn't helpful, so let's change the kernel interface". And that's just plain stupid. if the tool isn't helpful, then FIX THE TOOL. It's that simple. The fact is, dm isn't special. We use 'errno' absolutely everywhere else. What makes dm so special that the dm tools can't deal well with them? Look at the profile tools (just to give an example of a tool that is in the kernel tree, so i can grep for it). Sometimes it does just if (aio_errno != EINTR) pr_err("failed to write perf data, error: %m\n"); and prints that error string. But sometimes it does things like if (errno == EPERM) { pr_err("Permission error mapping pages.\n" "Consider increasing " "/proc/sys/kernel/perf_event_mlock_kb,\n" "or try again with a smaller value of -m/--mmap_pages.\n" "(current value: %u,%u)\n", opts->mmap_pages, opts->auxtrace_mmap_pages); because the tool isn't garbage. You are basically saying that the kernel should generate those strings. And I'm saying you are completely wrong, and that no, I will not pull this kind of silly interface, because it's an actively horribly broken garbage interface. Furthermore, I'm telling you that you need to really *understand* that no, device-mapper isn't some super-special thing that magically should do string errors. This is not something worth discussing. This is something where you need to just realize that you are wrong. End of story. Linus
On Tue, Oct 18, 2022 at 3:11 PM Milan Broz <gmazyland@gmail.com> wrote: > > The problem is that ioctl() are often just translated to -EINVAL. Oh, that's absolutely a real problem. Don't use one single error number. Linus
On Tue, Oct 18, 2022 at 11:54:49AM -0700, Linus Torvalds wrote: > I think we have one major interface that is string-based (apart from > the obvious pathname ones and the strings passed to 'execve()'). > > It's 'mount()' (and now fsconfig() etc), and it's string-based mainly > because it has that nasty "arbitrary things that different filesystem > may need for configuration"). And it has some nasty logging model > associated with it too for output. > > But no, we absolutely do *not* want to emulate that particular horror > anywhere else. This might ruin your day, but FYI, Netlink [...did you already run off screaming at the mention of Netlink?...] Netlink has the whole "extack" extended acknowledgement mechanism, in which netlink replies can and do contain error strings. Grep the kernel for NL_SET_ERR_MSG and you'll see a bunch of these. (And as you suggested, I wouldn't be surprised if some bad userspaces parse them.) Jason
On Tue, Oct 18, 2022 at 02:48:01PM -0400, Mike Snitzer wrote: > > That really does not sound like a good idea at all. And it does not > > seem to have any MM or core maintainer signoffs. > > Sorry, I should've solicited proper review more loudly. > > But if you have a specific concern with how DM is looking to use > is_vmalloc_or_module_addr() please say what that is. If I understand the patch correct it tries to use it to detect if a string is a string global. Besides being nasty API abuse I can't see how this would even work if DM is built in.
On Tue, Oct 18, 2022 at 11:54:49AM -0700, Linus Torvalds wrote: > I wouldn't worry about maintainer sign-offs just for exporting a > helper function, but I agree with the whole concept being a complete > disaster and not a good idea at all. It's not just a a "helper", it is internal magic for KASAN and low-level code patching. No driver has any business checking if something is a module text/data address, and I'm fairly sure how they are using it is actually wrong if DM is built in. FYI, I also agree that the concept is a disaster as well.
On Thu, 20 Oct 2022, Christoph Hellwig wrote: > On Tue, Oct 18, 2022 at 02:48:01PM -0400, Mike Snitzer wrote: > > > That really does not sound like a good idea at all. And it does not > > > seem to have any MM or core maintainer signoffs. > > > > Sorry, I should've solicited proper review more loudly. > > > > But if you have a specific concern with how DM is looking to use > > is_vmalloc_or_module_addr() please say what that is. > > If I understand the patch correct it tries to use it to detect if > a string is a string global. Besides being nasty API abuse I can't > see how this would even work if DM is built in. It works for built-in DM. You have "kfree_const" that detects if the string points to to .rodata with "is_kernel_rodata". Unfortunatelly, is_kernel_rodata doesn't detect if the string points to some modules's rodata, so we need an extra check for that. So, the logic is: if (!is_vmalloc_or_module_addr(ptr) && !is_kernel_rodata(ptr)) kfree(ptr); I thought about modifying is_kernel_rodata to detect module's rodata as well, but it wouldn't work because kstrdup_const uses it and there would be crash possibility: ptr = kstrdup_const(modules_rodata); unload_module(); use "ptr"; Mikulas