[v2,0/5] slab cleanups

Message ID 20220304063427.372145-1-42.hyeyoo@gmail.com (mailing list archive)

Message

Hyeonggon Yoo March 4, 2022, 6:34 a.m. UTC
Changes from v1:
	SLAB now passes requests larger than an order-1 page
	to the page allocator.
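
	As a rough illustration of that threshold, here is a minimal
	userspace sketch. It is not kernel code: the 4 KiB page size and
	the names MODEL_PAGE_SIZE, MODEL_ORDER1_SIZE and
	model_goes_to_page_allocator() are assumptions made up for this
	example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the real page size depends on the architecture. */
#define MODEL_PAGE_SIZE   4096UL
/* An order-n allocation spans (page size << n) bytes, so order-1 is two pages. */
#define MODEL_ORDER1_SIZE (MODEL_PAGE_SIZE << 1)

static_assert(MODEL_ORDER1_SIZE == 2 * MODEL_PAGE_SIZE,
	      "order-1 must be exactly two pages");

/*
 * Hypothetical helper: returns true when a request of this size would
 * skip the kmalloc caches and be handed to the page allocator, i.e.
 * when it is strictly larger than an order-1 page.
 */
static bool model_goes_to_page_allocator(size_t size)
{
	return size > MODEL_ORDER1_SIZE;
}
```

	Under these assumptions a request of exactly two pages still
	comes from a kmalloc cache, and one byte more goes to the page
	allocator.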

	Addressed comments from Matthew, Vlastimil, and Rientjes.
	Thank you for the feedback!

	BTW, I have no idea what __ksize() should return when it is passed
	an object that was not allocated from slab. Both 0 and folio_size()
	seem wrong to me.

Hello, these are cleanup patches for slab.
Please consider them for slab-next :)

Any comments will be appreciated.
Thanks.

Hyeonggon Yoo (5):
  mm/slab: kmalloc: pass requests larger than order-1 page to page
    allocator
  mm/sl[au]b: unify __ksize()
  mm/sl[auo]b: move definition of __ksize() to mm/slab.h
  mm/slub: limit number of node partial slabs only in cache creation
  mm/slub: refactor deactivate_slab()

 include/linux/slab.h |  36 ++++++------
 mm/slab.c            |  51 ++++++++---------
 mm/slab.h            |  21 +++++++
 mm/slab_common.c     |  20 +++++++
 mm/slob.c            |   1 -
 mm/slub.c            | 130 ++++++++++++-------------------------------
 6 files changed, 114 insertions(+), 145 deletions(-)

Comments

Marco Elver March 4, 2022, 11:50 a.m. UTC | #1
On Fri, 4 Mar 2022 at 07:34, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> Changes from v1:
>         Now SLAB passes requests larger than order-1 page
>         to page allocator.
>
>         Adjusted comments from Matthew, Vlastimil, Rientjes.
>         Thank you for feedback!
>
>         BTW, I have no idea what __ksize() should return when an object that
>         is not allocated from slab is passed. both 0 and folio_size()
>         seems wrong to me.

Didn't we say 0 would be the safer of the two options?
https://lkml.kernel.org/r/0e02416f-ef43-dc8a-9e8e-50ff63dd3c61@suse.cz

Hyeonggon Yoo March 4, 2022, 12:02 p.m. UTC | #2
On Fri, Mar 04, 2022 at 12:50:21PM +0100, Marco Elver wrote:
> On Fri, 4 Mar 2022 at 07:34, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >
> > Changes from v1:
> >         Now SLAB passes requests larger than order-1 page
> >         to page allocator.
> >
> >         Adjusted comments from Matthew, Vlastimil, Rientjes.
> >         Thank you for feedback!
> >
> >         BTW, I have no idea what __ksize() should return when an object that
> >         is not allocated from slab is passed. both 0 and folio_size()
> >         seems wrong to me.
> 
> Didn't we say 0 would be the safer of the two options?
> https://lkml.kernel.org/r/0e02416f-ef43-dc8a-9e8e-50ff63dd3c61@suse.cz
>

Oh sorry, I didn't understand why 0 was safer when I was reading it.

Reading it again, 0 is safer because KASAN does not unpoison a
wrongly passed object, right?

Marco Elver March 4, 2022, 1:11 p.m. UTC | #3
On Fri, 4 Mar 2022 at 13:02, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> On Fri, Mar 04, 2022 at 12:50:21PM +0100, Marco Elver wrote:
> > On Fri, 4 Mar 2022 at 07:34, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> > >
> > > Changes from v1:
> > >         Now SLAB passes requests larger than order-1 page
> > >         to page allocator.
> > >
> > >         Adjusted comments from Matthew, Vlastimil, Rientjes.
> > >         Thank you for feedback!
> > >
> > >         BTW, I have no idea what __ksize() should return when an object that
> > >         is not allocated from slab is passed. both 0 and folio_size()
> > >         seems wrong to me.
> >
> > Didn't we say 0 would be the safer of the two options?
> > https://lkml.kernel.org/r/0e02416f-ef43-dc8a-9e8e-50ff63dd3c61@suse.cz
> >
>
> Oh sorry, I didn't understand why 0 was safer when I was reading it.
>
> Reading again, 0 is safer because kasan does not unpoison for
> wrongly passed object, right?

Not quite. KASAN can tell if something is wrong, i.e. an invalid object.
Similarly, if you are able to tell some other way that the passed
pointer is not a valid object, you can do something better: namely,
return 0. The intuition here is that the caller has a pointer to an
invalid object, wants to use ksize() to determine its size, and will
most likely access all those bytes. Arguably, at that point the kernel
is already in a degraded state. But we can try not to let things get
worse by having ksize() return 0, in the hope that it will stop
corrupting more memory. It won't work in all cases, but it should keep
patterns like "s = ksize(obj); touch_all_bytes(obj, s)", where the size
bounds the memory accessed, from corrupting random memory.

The other reason is that a caller could actually check the size, and
if 0, do something else. Few callers will do so, because nobody
expects that their code has a bug. :-)
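
Both reasons can be illustrated with a small userspace sketch. This is
not kernel code: model_ksize() is a hypothetical stand-in for ksize(),
and the is_valid flag is an assumption standing in for whatever check
tells the allocator that the pointer is not a valid object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for ksize(): reports the real size for a valid
 * object and 0 when the object is known to be invalid.
 */
static size_t model_ksize(const void *obj, size_t real_size, int is_valid)
{
	(void)obj;
	return is_valid ? real_size : 0;
}

/*
 * The "touch all bytes" pattern: the returned size bounds the write,
 * so a size of 0 means nothing is written.
 */
static size_t model_clear_object(void *obj, size_t real_size, int is_valid)
{
	size_t s = model_ksize(obj, real_size, is_valid);

	assert(obj != NULL);	/* memset() needs a valid pointer even for 0 bytes */
	memset(obj, 0, s);
	return s;
}

/* A caller that checks the result instead of trusting it. */
static int model_clear_object_checked(void *obj, size_t real_size, int is_valid)
{
	size_t s = model_ksize(obj, real_size, is_valid);

	if (s == 0)
		return -1;	/* treat as an error and touch nothing */
	memset(obj, 0, s);
	return 0;
}
```

With is_valid set to 0, model_clear_object() leaves the buffer untouched
and model_clear_object_checked() reports an error instead.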
Vlastimil Babka March 4, 2022, 4:42 p.m. UTC | #4
On 3/4/22 14:11, Marco Elver wrote:
> On Fri, 4 Mar 2022 at 13:02, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>>
>> On Fri, Mar 04, 2022 at 12:50:21PM +0100, Marco Elver wrote:
>> > On Fri, 4 Mar 2022 at 07:34, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>> > >
>> > > Changes from v1:
>> > >         Now SLAB passes requests larger than order-1 page
>> > >         to page allocator.
>> > >
>> > >         Adjusted comments from Matthew, Vlastimil, Rientjes.
>> > >         Thank you for feedback!
>> > >
>> > >         BTW, I have no idea what __ksize() should return when an object that
>> > >         is not allocated from slab is passed. both 0 and folio_size()
>> > >         seems wrong to me.
>> >
>> > Didn't we say 0 would be the safer of the two options?
>> > https://lkml.kernel.org/r/0e02416f-ef43-dc8a-9e8e-50ff63dd3c61@suse.cz
>> >
>>
>> Oh sorry, I didn't understand why 0 was safer when I was reading it.
>>
>> Reading again, 0 is safer because kasan does not unpoison for
>> wrongly passed object, right?
> 
> Not quite. KASAN can tell if something is wrong, i.e. invalid object.
> Similarly, if you are able to tell if the passed pointer is not a
> valid object some other way, you can do something better - namely,
> return 0.

Hmm, but how paranoid do we have to be? Patch 1 converts SLAB to use
kmalloc_large(). So it's now legitimate to have objects allocated by SLAB's
kmalloc() that don't have the slab folio flag set, and their size is
folio_size(). That would be more common than getting a bogus pointer, so
should we return 0 just because a bogus pointer is possible? If we do that,
then KASAN will fail to unpoison legitimate kmalloc_large() objects, no?
What I suggested earlier is that we could make the checks more precise: if
folio_size() is smaller than or equal to an order-1 page, then it's bogus,
because we only do kmalloc_large() for >order-1. If the object pointer does
not point to the beginning of the folio, then it's bogus, because
kmalloc_large() returns the beginning of the folio. In these cases we
return 0, but otherwise we should return folio_size()?
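
The two checks described above could look roughly like the following
userspace sketch. It is not the actual patch: struct model_folio,
model_folio_size(), model_large_ksize() and the 4 KiB page size are all
assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the real page size depends on the architecture. */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical, heavily simplified model of a folio. */
struct model_folio {
	uintptr_t start;	/* address of the first byte of the folio */
	unsigned int order;	/* the folio spans (page size << order) bytes */
	int is_slab;		/* nonzero if the slab flag is set */
};

static size_t model_folio_size(const struct model_folio *folio)
{
	return MODEL_PAGE_SIZE << folio->order;
}

/*
 * Size to report for an object that lives in a non-slab folio, or 0 if
 * the pointer cannot have come from the large allocation path.
 */
static size_t model_large_ksize(const struct model_folio *folio, uintptr_t obj)
{
	assert(folio != NULL && !folio->is_slab);

	/* Large allocations are only made for sizes above an order-1 page. */
	if (model_folio_size(folio) <= (MODEL_PAGE_SIZE << 1))
		return 0;

	/* A large allocation always starts at the beginning of its folio. */
	if (obj != folio->start)
		return 0;

	return model_folio_size(folio);
}
```

In this model, only a pointer to the start of a folio larger than order-1
gets a nonzero size; anything else is treated as bogus.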

> The intuition here is that the caller has a pointer to an
> invalid object, and wants to use ksize() to determine its size, and
> most likely access all those bytes. Arguably, at that point the kernel
> is already in a degrading state. But we can try to not let things get
> worse by having ksize() return 0, in the hopes that it will stop
> corrupting more memory. It won't work in all cases, but should avoid
> things like "s = ksize(obj); touch_all_bytes(obj, s)" where the size
> bounds the memory accessed corrupting random memory.
> 
> The other reason is that a caller could actually check the size, and
> if 0, do something else. Few callers will do so, because nobody
> expects that their code has a bug. :-)
Marco Elver March 4, 2022, 4:45 p.m. UTC | #5
On Fri, 4 Mar 2022 at 17:42, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 3/4/22 14:11, Marco Elver wrote:
> > On Fri, 4 Mar 2022 at 13:02, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >>
> >> On Fri, Mar 04, 2022 at 12:50:21PM +0100, Marco Elver wrote:
> >> > On Fri, 4 Mar 2022 at 07:34, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >> > >
> >> > > Changes from v1:
> >> > >         Now SLAB passes requests larger than order-1 page
> >> > >         to page allocator.
> >> > >
> >> > >         Adjusted comments from Matthew, Vlastimil, Rientjes.
> >> > >         Thank you for feedback!
> >> > >
> >> > >         BTW, I have no idea what __ksize() should return when an object that
> >> > >         is not allocated from slab is passed. both 0 and folio_size()
> >> > >         seems wrong to me.
> >> >
> >> > Didn't we say 0 would be the safer of the two options?
> >> > https://lkml.kernel.org/r/0e02416f-ef43-dc8a-9e8e-50ff63dd3c61@suse.cz
> >> >
> >>
> >> Oh sorry, I didn't understand why 0 was safer when I was reading it.
> >>
> >> Reading again, 0 is safer because kasan does not unpoison for
> >> wrongly passed object, right?
> >
> > Not quite. KASAN can tell if something is wrong, i.e. invalid object.
> > Similarly, if you are able to tell if the passed pointer is not a
> > valid object some other way, you can do something better - namely,
> > return 0.
>
> Hmm, but how paranoid do we have to be? Patch 1 converts SLAB to use
> kmalloc_large(). So it's now legitimate to have objects allocated by SLAB's
> kmalloc() that don't have a slab folio flag set, and their size is
> folio_size(). It would be more common than getting a bogus pointer, so
> should we return 0 just because a bogus pointer is possible?

No, of course not, which is why I asked in the earlier email if it's a
"definitive failure case".

> If we do that,
> then KASAN will fail to unpoison legitimate kmalloc_large() objects, no?
> What I suggested earlier is we could make the checks more precise - if
> folio_size() is smaller or equal order-1 page, then it's bogus because we
> only do kmalloc_large() for >order-1. If the object pointer is not to the
> beginning of the folio, then it's bogus, because kmalloc_large() returns the
> beginning of the folio. Then in these case we return 0, but otherwise we
> should return folio_size()?
>
> > The intuition here is that the caller has a pointer to an
> > invalid object, and wants to use ksize() to determine its size, and
> > most likely access all those bytes. Arguably, at that point the kernel
> > is already in a degrading state. But we can try to not let things get
> > worse by having ksize() return 0, in the hopes that it will stop
> > corrupting more memory. It won't work in all cases, but should avoid
> > things like "s = ksize(obj); touch_all_bytes(obj, s)" where the size
> > bounds the memory accessed corrupting random memory.
> >
> > The other reason is that a caller could actually check the size, and
> > if 0, do something else. Few callers will do so, because nobody
> > expects that their code has a bug. :-)
>
Hyeonggon Yoo March 5, 2022, 4 a.m. UTC | #6
On Fri, Mar 04, 2022 at 02:11:50PM +0100, Marco Elver wrote:
> On Fri, 4 Mar 2022 at 13:02, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >
> > On Fri, Mar 04, 2022 at 12:50:21PM +0100, Marco Elver wrote:
> > > On Fri, 4 Mar 2022 at 07:34, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> > > >
> > > > Changes from v1:
> > > >         Now SLAB passes requests larger than order-1 page
> > > >         to page allocator.
> > > >
> > > >         Adjusted comments from Matthew, Vlastimil, Rientjes.
> > > >         Thank you for feedback!
> > > >
> > > >         BTW, I have no idea what __ksize() should return when an object that
> > > >         is not allocated from slab is passed. both 0 and folio_size()
> > > >         seems wrong to me.
> > >
> > > Didn't we say 0 would be the safer of the two options?
> > > https://lkml.kernel.org/r/0e02416f-ef43-dc8a-9e8e-50ff63dd3c61@suse.cz
> > >
> >
> > Oh sorry, I didn't understand why 0 was safer when I was reading it.
> >
> > Reading again, 0 is safer because kasan does not unpoison for
> > wrongly passed object, right?
> 
> Not quite. KASAN can tell if something is wrong, i.e. invalid object.
> Similarly, if you are able to tell if the passed pointer is not a
> valid object some other way, you can do something better - namely,
> return 0.
>
> The intuition here is that the caller has a pointer to an
> invalid object, and wants to use ksize() to determine its size, and
> most likely access all those bytes. Arguably, at that point the kernel
> is already in a degrading state. But we can try to not let things get
> worse by having ksize() return 0, in the hopes that it will stop
> corrupting more memory. It won't work in all cases, but should avoid
> things like "s = ksize(obj); touch_all_bytes(obj, s)" where the size
> bounds the memory accessed corrupting random memory.

Oh, it's to prevent further memory corruption in the failure case,
e.g. memset(obj, 0, s);

> The other reason is that a caller could actually check the size, and
> if 0, do something else. Few callers will do so, because nobody
> expects that their code has a bug. :-)

And it lets the caller check for errors.
Thank you so much for the kind explanation.

I'll add what Vlastimil suggested in the next series. Thanks!