Message ID | 20241122095746.198762-1-amir73il@gmail.com |
---|---|
State | New |
Headers | show |
Series | [GIT,PULL] overlayfs updates for 6.13 | expand |
On Fri, 22 Nov 2024 at 01:57, Amir Goldstein <amir73il@gmail.com> wrote: > > - Introduction and use of revert/override_creds_light() helpers, that were > suggested by Christian as a mitigation to cache line bouncing and false > sharing of fields in overlayfs creator_cred long lived struct cred copy. So I don't actively hate this, but I do wonder if this shouldn't have been done differently. In particular, I suspect *most* users of override_creds() actually wants this "light" version, because they all already hold a ref to the cred that they want to use as the override. We did it that safe way with the extra refcount not because most people would need it, but it was expected to not be a big deal. Now you found that it *is* a big deal, and instead of just fixing the old interface, you create a whole new interface and the mental burden of having to know the difference between the two. So may I ask that you look at perhaps just converting the (not very many) users of the non-light cred override to the "light" version? Because I suspect you will find that they basically *all* convert. I wouldn't be surprised if some of them could convert automatically with a coccinelle script. Because we actually have several users that have a pattern line old_cred = override_creds(override_cred); /* override_cred() gets its own ref */ put_cred(override_cred); because it *didn't* want the new cred, because it's literally a temporary cred that already had the single ref it needed, and the code actually it wants it to go away when it does revert_creds(old_cred); End result: I suspect what it *really* would have wanted is basically to have 'override_creds()' not do the refcount at all, and at revert time, it would want "revert_creds()" to return the creds that got reverted, and then it would just do old_cred = override_creds(override_cred); ... put_cred(revert_creds(old_cred)); instead - which would not change the refcount on 'old_cred' at all at any time (and does it for the override case only at the end when it actually wants it free'd). And the above is very annoyingly *almost* exactly what your "light" interface does, except your interface is bad too: it doesn't return the reverted creds. So then users have to remember the override_creds *and* the old creds, just to do their own cred refcounting outside of this all. In other words, what I really dislike about this all is that (a) we had a flawed interface (b) you added *another* flawed interface for one special case you cared about (c) now we have *two* flawed interfaces instead of one better one Hmm? Linus
On Fri, 22 Nov 2024 at 21:21, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So I don't actively hate this, but I do wonder if this shouldn't have > been done differently. Just to clarify: because I understand *why* you wanted this, and because I don't hate it with a passion, I have pulled your changes. But I really think we could and should do better. Please? Linus
The pull request you sent on Fri, 22 Nov 2024 10:57:46 +0100:
> git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs.git ovl-update-6.13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e7675238b9bf4db0b872d5dbcd53efa31914c98f
Thank you!
On Fri, 22 Nov 2024 at 21:21, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So may I ask that you look at perhaps just converting the (not very > many) users of the non-light cred override to the "light" version? I think you could do a completely automated conversion: (a) add a new "dup_cred()" helper /* Get the cred without clearing the 'non_rcu' flag */ const struct cred *dup_cred(const struct cred *cred) { get_new_cred((struct cred *)cred); return cred; } (b) mindlessly convert: override_creds(cred) -> override_creds_light(dup_cred(cred)) revert_creds(cred) -> put_cred(revert_creds_light(old)); (c) rename away the "_light" again: override_creds_light -> override_creds revert_creds_light -> revert_creds and then finally the only non-automated part would be (d) simplify any obvious and trivial dup_cred -> put_cred chains. which might take some effort, but there should be at least a couple of really obvious cases of "that's not necessary". Because honestly, I think I'd rather see a few cases of old_creds = override_creds(dup_cred(cred)); ... put_cred(revert_creds(old)); that look a bit more complicated, and couldn't be trivially simplified away. That seems better than the current case of having two very different forms of override_creds() / put_cred() where people have to know deeply when to use one or the other. No? Linus
On Fri, Nov 22, 2024 at 10:09:04PM -0800, Linus Torvalds wrote: > (a) add a new "dup_cred()" helper > > /* Get the cred without clearing the 'non_rcu' flag */ > const struct cred *dup_cred(const struct cred *cred) > { get_new_cred((struct cred *)cred); return cred; } Umm... Something like hold_cred() might be better - dup usually implies copying an object... For grapping a reference we normally go for something like hold/get/grab/pin...
On Fri, Nov 22, 2024 at 09:21:58PM -0800, Linus Torvalds wrote: > On Fri, 22 Nov 2024 at 01:57, Amir Goldstein <amir73il@gmail.com> wrote: > > > > - Introduction and use of revert/override_creds_light() helpers, that were > > suggested by Christian as a mitigation to cache line bouncing and false > > sharing of fields in overlayfs creator_cred long lived struct cred copy. > > So I don't actively hate this, but I do wonder if this shouldn't have > been done differently. > > In particular, I suspect *most* users of override_creds() actually > wants this "light" version, because they all already hold a ref to the > cred that they want to use as the override. > > We did it that safe way with the extra refcount not because most > people would need it, but it was expected to not be a big deal. > > Now you found that it *is* a big deal, and instead of just fixing the > old interface, you create a whole new interface and the mental burden > of having to know the difference between the two. > So may I ask that you look at perhaps just converting the (not very > many) users of the non-light cred override to the "light" version? I think that could be a good idea in general. But I have to say I'm feeling a bit defensive after having read your message even though I usually try not to. :) So just to clarify when that issue was brought up I realized that the cred bump was a big deal for overlayfs but from a quick grep I didn't think for any of the other cases it really mattered that much. Realistically, overlayfs is the prime example where that override cred matters big time because it's called everywhere and in all core operations one can think of. But so far I at least haven't heard complaints outside of that and so the immediate focus was to bring about a solution for overlayfs. The reason the revert_creds_light() variant doesn't return the old creds is so that callers don't put_cred() them blindly. Because for overlayfs (and from a quick glance io_uring and nfs) the refcount for the temporary creds is kept completely independent of the callsites. The lifetime is bound to the superblock and so the final put on the temporary creds has nothing to do with the callers at all.
On Fri, 22 Nov 2024 at 22:14, Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Fri, Nov 22, 2024 at 10:09:04PM -0800, Linus Torvalds wrote: > > > (a) add a new "dup_cred()" helper > > > > /* Get the cred without clearing the 'non_rcu' flag */ > > const struct cred *dup_cred(const struct cred *cred) > > { get_new_cred((struct cred *)cred); return cred; } > > Umm... Something like hold_cred() might be better - dup usually > implies copying an object... Ack. "dup" is clearly a horrible name, and I'm ashamed and properly chastised. Linus
On Sat, 23 Nov 2024 at 04:06, Christian Brauner <brauner@kernel.org> wrote: > > So just to clarify when that issue was brought up I realized that the > cred bump was a big deal for overlayfs but from a quick grep I didn't > think for any of the other cases it really mattered that much. Oh, I agree. It's probably not really a performance issue anywhere else. I don't think this has really ever come up before. So my "please convert everything to one single new model" is not because I think that would help performance, but because I really hate having two differently flawed models when I think one would do. We have other situations where we really do have two or more different interfaces for the "same" thing, with very special rules: things like fget() vs fget_raw() vs fget_task() (and similar issues wrt fdget). But I think those other situations have more _reason_ for them. The whole "override_creds()" thing is _already_ such a special operation, that I hate seeing two subtly different versions of the interface, both with their own quirks. Because the old interface really isn't some "perfectly tailored" thing. Yes, the performance implications were a surprise to me and I hadn't seen that before, but the "refcounting isn't wonderful" was _not_ really a big surprise at all. Linus
On Sat, Nov 23, 2024 at 01:06:14PM +0100, Christian Brauner wrote: > On Fri, Nov 22, 2024 at 09:21:58PM -0800, Linus Torvalds wrote: > > On Fri, 22 Nov 2024 at 01:57, Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > - Introduction and use of revert/override_creds_light() helpers, that were > > > suggested by Christian as a mitigation to cache line bouncing and false > > > sharing of fields in overlayfs creator_cred long lived struct cred copy. > > > > So I don't actively hate this, but I do wonder if this shouldn't have > > been done differently. > > > > In particular, I suspect *most* users of override_creds() actually > > wants this "light" version, because they all already hold a ref to the > > cred that they want to use as the override. > > > > We did it that safe way with the extra refcount not because most > > people would need it, but it was expected to not be a big deal. > > > > Now you found that it *is* a big deal, and instead of just fixing the > > old interface, you create a whole new interface and the mental burden > > of having to know the difference between the two. > > > So may I ask that you look at perhaps just converting the (not very > > many) users of the non-light cred override to the "light" version? > > I think that could be a good idea in general. > > But I have to say I'm feeling a bit defensive after having read your > message even though I usually try not to. :) It was just pointed out to me that this was written like I'm not reading you messages - which is obviously not the case. What I means it that I usually try to not be defensive when valid criticism is brought up. :)