diff mbox series

alloc_tag: Tighten file permissions on /proc/allocinfo

Message ID 20240425200844.work.184-kees@kernel.org (mailing list archive)
State Mainlined
Commit e13106952faad91c6e492bf23b7cbdf1b1c269ce
Headers show
Series alloc_tag: Tighten file permissions on /proc/allocinfo | expand

Commit Message

Kees Cook April 25, 2024, 8:08 p.m. UTC
The /proc/allocinfo file exposes a tremendous about of information about
kernel build details, memory allocations (obviously), and potentially
even image layout (due to ordering). As this is intended to be consumed
by system owners (like /proc/slabinfo), use the same file permissions as
there: 0400.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
---
 lib/alloc_tag.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Kent Overstreet April 25, 2024, 8:45 p.m. UTC | #1
On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> The /proc/allocinfo file exposes a tremendous about of information about
> kernel build details, memory allocations (obviously), and potentially
> even image layout (due to ordering). As this is intended to be consumed
> by system owners (like /proc/slabinfo), use the same file permissions as
> there: 0400.

Err...

The side effect of locking down more and more reporting interfaces is
that programs that consume those interfaces now have to run as root.

That's not what we want.
Matthew Wilcox (Oracle) April 25, 2024, 8:51 p.m. UTC | #2
On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > The /proc/allocinfo file exposes a tremendous about of information about
> > kernel build details, memory allocations (obviously), and potentially
> > even image layout (due to ordering). As this is intended to be consumed
> > by system owners (like /proc/slabinfo), use the same file permissions as
> > there: 0400.
> 
> Err...
> 
> The side effect of locking down more and more reporting interfaces is
> that programs that consume those interfaces now have to run as root.

sudo cat /proc/allocinfo | analyse-that-fie
Kees Cook April 25, 2024, 8:57 p.m. UTC | #3
On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > The /proc/allocinfo file exposes a tremendous about of information about
> > kernel build details, memory allocations (obviously), and potentially
> > even image layout (due to ordering). As this is intended to be consumed
> > by system owners (like /proc/slabinfo), use the same file permissions as
> > there: 0400.
> 
> The side effect of locking down more and more reporting interfaces is
> that programs that consume those interfaces now have to run as root.

I'm fine if you want to tie it to some existing capability, but it
shouldn't be world-readable. Also, plenty of diagnostic tools already
either run as root or open whatever files they need to before dropping
privs.
Kent Overstreet April 25, 2024, 9:04 p.m. UTC | #4
On Thu, Apr 25, 2024 at 09:51:56PM +0100, Matthew Wilcox wrote:
> On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> > On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > > The /proc/allocinfo file exposes a tremendous about of information about
> > > kernel build details, memory allocations (obviously), and potentially
> > > even image layout (due to ordering). As this is intended to be consumed
> > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > there: 0400.
> > 
> > Err...
> > 
> > The side effect of locking down more and more reporting interfaces is
> > that programs that consume those interfaces now have to run as root.
> 
> sudo cat /proc/allocinfo | analyse-that-fie

Even that is still an annoyance, but I'm thinking more about a future
daemon to collect this every n seconds - that really shouldn't need to
be root.

And the "lock everything down" approach really feels like paranoia gone
too far - what's next, /proc/cpuinfo? Do we really want to go the
Windows approach of UAC pop ups for everything? I'd rather be going the
opposite direction, of making it as easy as possible for users to see
what's going on with their machine.

Instead, why not a sysctl, like we already have for perf?

The concern about leaking image layout could be addressed by sorting the
output before returning to userspace.
Suren Baghdasaryan April 25, 2024, 9:21 p.m. UTC | #5
On Thu, Apr 25, 2024 at 2:04 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Thu, Apr 25, 2024 at 09:51:56PM +0100, Matthew Wilcox wrote:
> > On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> > > On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > > > The /proc/allocinfo file exposes a tremendous about of information about
> > > > kernel build details, memory allocations (obviously), and potentially
> > > > even image layout (due to ordering). As this is intended to be consumed
> > > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > > there: 0400.
> > >
> > > Err...
> > >
> > > The side effect of locking down more and more reporting interfaces is
> > > that programs that consume those interfaces now have to run as root.
> >
> > sudo cat /proc/allocinfo | analyse-that-fie
>
> Even that is still an annoyance, but I'm thinking more about a future
> daemon to collect this every n seconds - that really shouldn't need to
> be root.

Yeah, that would preclude some nice usecases. Could we maybe use
CAP_SYS_ADMIN checks instead? That way we can still use it from a
non-root process?

>
> And the "lock everything down" approach really feels like paranoia gone
> too far - what's next, /proc/cpuinfo? Do we really want to go the
> Windows approach of UAC pop ups for everything? I'd rather be going the
> opposite direction, of making it as easy as possible for users to see
> what's going on with their machine.
>
> Instead, why not a sysctl, like we already have for perf?
>
> The concern about leaking image layout could be addressed by sorting the
> output before returning to userspace.
Kent Overstreet April 25, 2024, 9:25 p.m. UTC | #6
On Thu, Apr 25, 2024 at 02:21:39PM -0700, Suren Baghdasaryan wrote:
> On Thu, Apr 25, 2024 at 2:04 PM Kent Overstreet
> <kent.overstreet@linux.dev> wrote:
> >
> > On Thu, Apr 25, 2024 at 09:51:56PM +0100, Matthew Wilcox wrote:
> > > On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> > > > On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > > > > The /proc/allocinfo file exposes a tremendous about of information about
> > > > > kernel build details, memory allocations (obviously), and potentially
> > > > > even image layout (due to ordering). As this is intended to be consumed
> > > > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > > > there: 0400.
> > > >
> > > > Err...
> > > >
> > > > The side effect of locking down more and more reporting interfaces is
> > > > that programs that consume those interfaces now have to run as root.
> > >
> > > sudo cat /proc/allocinfo | analyse-that-fie
> >
> > Even that is still an annoyance, but I'm thinking more about a future
> > daemon to collect this every n seconds - that really shouldn't need to
> > be root.
> 
> Yeah, that would preclude some nice usecases. Could we maybe use
> CAP_SYS_ADMIN checks instead? That way we can still use it from a
> non-root process?

A sysctl would be more in line with what we do for perf. Capabilities
aren't very usable, and CAP_SYS_ADMIN is already way too much of an
everything bucket.
Andrew Morton April 25, 2024, 9:38 p.m. UTC | #7
On Thu, 25 Apr 2024 14:21:39 -0700 Suren Baghdasaryan <surenb@google.com> wrote:

> > > > The side effect of locking down more and more reporting interfaces is
> > > > that programs that consume those interfaces now have to run as root.
> > >
> > > sudo cat /proc/allocinfo | analyse-that-fie
> >
> > Even that is still an annoyance, but I'm thinking more about a future
> > daemon to collect this every n seconds - that really shouldn't need to
> > be root.
> 
> Yeah, that would preclude some nice usecases. Could we maybe use
> CAP_SYS_ADMIN checks instead? That way we can still use it from a
> non-root process?

I'm inclined to keep Kees's 0400.  Yes it's a hassle but security is
always a hassle.  Let's not make Linux less secure, especially for
people who aren't even using /proc/allocinfo.

If someone really wants 0666 then they can chmod the thing from
initscripts, can't they?
Kent Overstreet April 25, 2024, 9:45 p.m. UTC | #8
On Thu, Apr 25, 2024 at 02:38:42PM -0700, Andrew Morton wrote:
> On Thu, 25 Apr 2024 14:21:39 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> 
> > > > > The side effect of locking down more and more reporting interfaces is
> > > > > that programs that consume those interfaces now have to run as root.
> > > >
> > > > sudo cat /proc/allocinfo | analyse-that-fie
> > >
> > > Even that is still an annoyance, but I'm thinking more about a future
> > > daemon to collect this every n seconds - that really shouldn't need to
> > > be root.
> > 
> > Yeah, that would preclude some nice usecases. Could we maybe use
> > CAP_SYS_ADMIN checks instead? That way we can still use it from a
> > non-root process?
> 
> I'm inclined to keep Kees's 0400.  Yes it's a hassle but security is
> always a hassle.  Let's not make Linux less secure, especially for
> people who aren't even using /proc/allocinfo.

That's a bit too trite; we've seen often enough that putting security
above all other concerns leads to worse outcomes in the long run; impair
usability too much and you're just causing more problems than you solve.

We need to take a balanced approach, like with everything else we do.

I'd really like to hear from Kees why pre-sorting the output so we aren't
leaking kernel image details wouldn't be sufficient.
Kees Cook April 25, 2024, 10:42 p.m. UTC | #9
On Thu, Apr 25, 2024 at 05:04:47PM -0400, Kent Overstreet wrote:
> On Thu, Apr 25, 2024 at 09:51:56PM +0100, Matthew Wilcox wrote:
> > On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> > > On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > > > The /proc/allocinfo file exposes a tremendous about of information about
> > > > kernel build details, memory allocations (obviously), and potentially
> > > > even image layout (due to ordering). As this is intended to be consumed
> > > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > > there: 0400.
> > > 
> > > Err...
> > > 
> > > The side effect of locking down more and more reporting interfaces is
> > > that programs that consume those interfaces now have to run as root.
> > 
> > sudo cat /proc/allocinfo | analyse-that-fie
> 
> Even that is still an annoyance, but I'm thinking more about a future
> daemon to collect this every n seconds - that really shouldn't need to
> be root.

Open it once and rewind? But regardless, see the end about changing
ownership/perms/etc.

> And the "lock everything down" approach really feels like paranoia gone
> too far - what's next, /proc/cpuinfo? Do we really want to go the
> Windows approach of UAC pop ups for everything? I'd rather be going the
> opposite direction, of making it as easy as possible for users to see
> what's going on with their machine.

Not unless there's something in /proc/cpuinfo that can't be extracted
in other methods. :) Anyway, you're offering a slippery-slope argument,
I just want to avoid new interfaces from having needlessly permissive
default access permissions.

I expect this to be enabled by default in distros, etc, so I'd like
to make sure it's not giving attackers more information than they
had before. Even reporting how much has been allocated at a specific
location means an attacker ends up with extremely accurate information
when attempting heap grooming, etc. Even the low granularity of slabinfo
is sufficient to improve attacks. (Which is why it's 0400 by default too.)

> Instead, why not a sysctl, like we already have for perf?

perf is a whole other beast, including syscalls, etc.

> The concern about leaking image layout could be addressed by sorting the
> output before returning to userspace.

It's trivial to change permissions from the default 0400 at boot time.
It can even have groups and ownership changed, etc. This is why we have
per-mount-namespace /proc instances:

# chgrp sysmonitor /proc/allocinfo
# chmod 0440 /proc/allocinfo

Poof, instant role-based access control. :)

I'm just trying to make the _default_ safe.
Kent Overstreet April 25, 2024, 11:02 p.m. UTC | #10
On Thu, Apr 25, 2024 at 03:42:30PM -0700, Kees Cook wrote:
> On Thu, Apr 25, 2024 at 05:04:47PM -0400, Kent Overstreet wrote:
> > On Thu, Apr 25, 2024 at 09:51:56PM +0100, Matthew Wilcox wrote:
> > > On Thu, Apr 25, 2024 at 04:45:51PM -0400, Kent Overstreet wrote:
> > > > On Thu, Apr 25, 2024 at 01:08:50PM -0700, Kees Cook wrote:
> > > > > The /proc/allocinfo file exposes a tremendous about of information about
> > > > > kernel build details, memory allocations (obviously), and potentially
> > > > > even image layout (due to ordering). As this is intended to be consumed
> > > > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > > > there: 0400.
> > > > 
> > > > Err...
> > > > 
> > > > The side effect of locking down more and more reporting interfaces is
> > > > that programs that consume those interfaces now have to run as root.
> > > 
> > > sudo cat /proc/allocinfo | analyse-that-fie
> > 
> > Even that is still an annoyance, but I'm thinking more about a future
> > daemon to collect this every n seconds - that really shouldn't need to
> > be root.
> 
> Open it once and rewind? But regardless, see the end about changing
> ownership/perms/etc.
> 
> > And the "lock everything down" approach really feels like paranoia gone
> > too far - what's next, /proc/cpuinfo? Do we really want to go the
> > Windows approach of UAC pop ups for everything? I'd rather be going the
> > opposite direction, of making it as easy as possible for users to see
> > what's going on with their machine.
> 
> Not unless there's something in /proc/cpuinfo that can't be extracted
> in other methods. :) Anyway, you're offering a slippery-slope argument,
> I just want to avoid new interfaces from having needlessly permissive
> default access permissions.

No, I'm asking where the line is and how we decide what we want to
restrict. I hope you agree that it shouldn't be everything?

I'm pushing hard for making things easier to debug, and making it easier
for normal users to poke around and see what their system is doing.
Simpler, easier to understand tools, tools that teach users how the
system works. In my book, anything that lowers that barrier to entry is
a good thing.

I've had a lot of positive, useful interactions from this, and by
talking with users about what they see (I'm _always_ on IRC, and my
channel is pretty active). I can't overstate how useful this sort of
thing is, because there's only so many of us developers and users will
notice things that we don't - sometimes quite significant things! and if
we focus on making the system easy to debug, often times they'll do a
lot of that debugging for us.

Feeding user's curiosity pays great dividends in the long run.

And on the usability side, we've done a lot over the years to nudge
people away from running as root; if we set things up so that prefixing
every command with 'sudo' becomes the default, then we really lose out
on much of the benefit of that.

> I expect this to be enabled by default in distros, etc, so I'd like
> to make sure it's not giving attackers more information than they
> had before. Even reporting how much has been allocated at a specific
> location means an attacker ends up with extremely accurate information
> when attempting heap grooming, etc. Even the low granularity of slabinfo
> is sufficient to improve attacks. (Which is why it's 0400 by default too.)

This is pretty esoteric stuff to me; what users should be remotely
concerned about heap grooming attacks on their personal machines?

When we let the memory-unsafety of C (as well as spectre/meltdown)
dominate system design, the result is... not good. Those are things that
need to be fixed at the root (and, thank Rust, _are_), so let's maybe
not go _too_ overboard otherwise it's going to take another couple
decades to unwind all the resulting stockholm syndrome.

> > Instead, why not a sysctl, like we already have for perf?
> 
> perf is a whole other beast, including syscalls, etc.

But semantically it's the exact same thing. With perf we've got a knob
to say "I'm not paranoid on this machine, I don't want users to have to
be able to root to do basic debugging" - why not lean into that?

Have a global knob that's more than just for perf, so single user and
desktop machines can probably leave it loose (or even expose it in the
distro installer) and servers can have it on.

Give users a way to say what they want.

> > The concern about leaking image layout could be addressed by sorting the
> > output before returning to userspace.
> 
> It's trivial to change permissions from the default 0400 at boot time.
> It can even have groups and ownership changed, etc. This is why we have
> per-mount-namespace /proc instances:
> 
> # chgrp sysmonitor /proc/allocinfo
> # chmod 0440 /proc/allocinfo
> 
> Poof, instant role-based access control. :)

The trouble is that the average user isn't going to know what reporting
interfaces it's safe to do that with; we don't want to encourage that
kind of mucking around by the unsuspecting. We really should have a
knob.
Andrew Morton April 25, 2024, 11:47 p.m. UTC | #11
On Thu, 25 Apr 2024 15:42:30 -0700 Kees Cook <keescook@chromium.org> wrote:

> > The concern about leaking image layout could be addressed by sorting the
> > output before returning to userspace.
> 
> It's trivial to change permissions from the default 0400 at boot time.
> It can even have groups and ownership changed, etc. This is why we have
> per-mount-namespace /proc instances:
> 
> # chgrp sysmonitor /proc/allocinfo
> # chmod 0440 /proc/allocinfo
> 
> Poof, instant role-based access control. :)

Conversely, the paranoid could set it to 0400 at boot also.

> I'm just trying to make the _default_ safe.

Agree with this.

Semi-seriously, how about we set the permissions to 0000 and force
distributors/users to make a decision.
Kent Overstreet April 26, 2024, 12:27 a.m. UTC | #12
On Thu, Apr 25, 2024 at 04:47:18PM -0700, Andrew Morton wrote:
> On Thu, 25 Apr 2024 15:42:30 -0700 Kees Cook <keescook@chromium.org> wrote:
> 
> > > The concern about leaking image layout could be addressed by sorting the
> > > output before returning to userspace.
> > 
> > It's trivial to change permissions from the default 0400 at boot time.
> > It can even have groups and ownership changed, etc. This is why we have
> > per-mount-namespace /proc instances:
> > 
> > # chgrp sysmonitor /proc/allocinfo
> > # chmod 0440 /proc/allocinfo
> > 
> > Poof, instant role-based access control. :)
> 
> Conversely, the paranoid could set it to 0400 at boot also.
> 
> > I'm just trying to make the _default_ safe.
> 
> Agree with this.
> 
> Semi-seriously, how about we set the permissions to 0000 and force
> distributors/users to make a decision.

I'm ok with 0400 for now since it's consistent with slabinfo, but I'd
really like to see a sysctl for debug info paranoia. We shouldn't be
leaving this to the distros; we're the ones with the expertise to say
what would be covered by that sysctl.
Kees Cook April 26, 2024, 12:39 a.m. UTC | #13
On Thu, Apr 25, 2024 at 04:47:18PM -0700, Andrew Morton wrote:
> On Thu, 25 Apr 2024 15:42:30 -0700 Kees Cook <keescook@chromium.org> wrote:
> 
> > > The concern about leaking image layout could be addressed by sorting the
> > > output before returning to userspace.
> > 
> > It's trivial to change permissions from the default 0400 at boot time.
> > It can even have groups and ownership changed, etc. This is why we have
> > per-mount-namespace /proc instances:
> > 
> > # chgrp sysmonitor /proc/allocinfo
> > # chmod 0440 /proc/allocinfo
> > 
> > Poof, instant role-based access control. :)
> 
> Conversely, the paranoid could set it to 0400 at boot also.
> 
> > I'm just trying to make the _default_ safe.
> 
> Agree with this.
> 
> Semi-seriously, how about we set the permissions to 0000 and force
> distributors/users to make a decision.

That seems an overly unfriendly default, but I guess if you wanted it as
a Kconfig setting? Just seems easiest to make distros that enable alloc
profiling safe by default but trivially available to root, and specialized
monitoring systems can do whatever they want with their /proc file ACLs.

This is kind of how all of this stuff works. There's nothing unique to
/proc/allocinfo. We do the same for slabinfo, vmallocinfo, timer_list,
etc. And I think it's totally reasonable to have paranoid defaults,
give the kind of outrageous stuff attackers have figured out how to reverse
engineer. Take for example "we can bypass SLUB randomization for the slab
from which struct file is allocated [by examining the /proc/$pid/fdinfo/
file contents]" Jann Horn demonstrated[1].

-Kees

[1] https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
Kees Cook April 26, 2024, 12:43 a.m. UTC | #14
On Thu, Apr 25, 2024 at 08:27:05PM -0400, Kent Overstreet wrote:
> On Thu, Apr 25, 2024 at 04:47:18PM -0700, Andrew Morton wrote:
> > On Thu, 25 Apr 2024 15:42:30 -0700 Kees Cook <keescook@chromium.org> wrote:
> > 
> > > > The concern about leaking image layout could be addressed by sorting the
> > > > output before returning to userspace.
> > > 
> > > It's trivial to change permissions from the default 0400 at boot time.
> > > It can even have groups and ownership changed, etc. This is why we have
> > > per-mount-namespace /proc instances:
> > > 
> > > # chgrp sysmonitor /proc/allocinfo
> > > # chmod 0440 /proc/allocinfo
> > > 
> > > Poof, instant role-based access control. :)
> > 
> > Conversely, the paranoid could set it to 0400 at boot also.
> > 
> > > I'm just trying to make the _default_ safe.
> > 
> > Agree with this.
> > 
> > Semi-seriously, how about we set the permissions to 0000 and force
> > distributors/users to make a decision.
> 
> I'm ok with 0400 for now since it's consistent with slabinfo, but I'd
> really like to see a sysctl for debug info paranoia. We shouldn't be
> leaving this to the distros; we're the ones with the expertise to say
> what would be covered by that sysctl.

We've not had great luck with sysctls (see userns sysctl discussions)
since they don't provide sufficient granularity.

All this said, I'm still not excited about any of these files living
in /proc at all -- we were supposed to use /sys for this kind of thing,
but its interface wasn't great for this kind of more "free-form" data,
and debugfs isn't good for production interfaces. /proc really should
only have pid information -- we end up exposing these top-level files to
every mount namespace with a /proc mount. :( But that's a yet-to-be-solved
problem...

-Kees
Kent Overstreet April 26, 2024, 12:58 a.m. UTC | #15
On Thu, Apr 25, 2024 at 05:43:33PM -0700, Kees Cook wrote:
> On Thu, Apr 25, 2024 at 08:27:05PM -0400, Kent Overstreet wrote:
> > On Thu, Apr 25, 2024 at 04:47:18PM -0700, Andrew Morton wrote:
> > > On Thu, 25 Apr 2024 15:42:30 -0700 Kees Cook <keescook@chromium.org> wrote:
> > > 
> > > > > The concern about leaking image layout could be addressed by sorting the
> > > > > output before returning to userspace.
> > > > 
> > > > It's trivial to change permissions from the default 0400 at boot time.
> > > > It can even have groups and ownership changed, etc. This is why we have
> > > > per-mount-namespace /proc instances:
> > > > 
> > > > # chgrp sysmonitor /proc/allocinfo
> > > > # chmod 0440 /proc/allocinfo
> > > > 
> > > > Poof, instant role-based access control. :)
> > > 
> > > Conversely, the paranoid could set it to 0400 at boot also.
> > > 
> > > > I'm just trying to make the _default_ safe.
> > > 
> > > Agree with this.
> > > 
> > > Semi-seriously, how about we set the permissions to 0000 and force
> > > distributors/users to make a decision.
> > 
> > I'm ok with 0400 for now since it's consistent with slabinfo, but I'd
> > really like to see a sysctl for debug info paranoia. We shouldn't be
> > leaving this to the distros; we're the ones with the expertise to say
> > what would be covered by that sysctl.
> 
> We've not had great luck with sysctls (see userns sysctl discussions)
> since they don't provide sufficient granularity.
> 
> All this said, I'm still not excited about any of these files living
> in /proc at all -- we were supposed to use /sys for this kind of thing,
> but its interface wasn't great for this kind of more "free-form" data,
> and debugfs isn't good for production interfaces. /proc really should
> only have pid information -- we end up exposing these top-level files to
> every mount namespace with a /proc mount. :( But that's a yet-to-be-solved
> problem...

It really wouldn't be that hard to relax the 4k file limit in sysfs.
Matthew Wilcox (Oracle) April 26, 2024, 3:25 a.m. UTC | #16
On Thu, Apr 25, 2024 at 08:58:34PM -0400, Kent Overstreet wrote:
> On Thu, Apr 25, 2024 at 05:43:33PM -0700, Kees Cook wrote:
> > All this said, I'm still not excited about any of these files living
> > in /proc at all -- we were supposed to use /sys for this kind of thing,
> > but its interface wasn't great for this kind of more "free-form" data,
> > and debugfs isn't good for production interfaces. /proc really should
> > only have pid information -- we end up exposing these top-level files to
> > every mount namespace with a /proc mount. :( But that's a yet-to-be-solved
> > problem...
> 
> It really wouldn't be that hard to relax the 4k file limit in sysfs.

It's a lot harder to relax the GregKH opposition to multiple values per
file in sysfs.
Kent Overstreet April 26, 2024, 3:35 a.m. UTC | #17
On Fri, Apr 26, 2024 at 04:25:40AM +0100, Matthew Wilcox wrote:
> On Thu, Apr 25, 2024 at 08:58:34PM -0400, Kent Overstreet wrote:
> > On Thu, Apr 25, 2024 at 05:43:33PM -0700, Kees Cook wrote:
> > > All this said, I'm still not excited about any of these files living
> > > in /proc at all -- we were supposed to use /sys for this kind of thing,
> > > but its interface wasn't great for this kind of more "free-form" data,
> > > and debugfs isn't good for production interfaces. /proc really should
> > > only have pid information -- we end up exposing these top-level files to
> > > every mount namespace with a /proc mount. :( But that's a yet-to-be-solved
> > > problem...
> > 
> > It really wouldn't be that hard to relax the 4k file limit in sysfs.
> 
> It's a lot harder to relax the GregKH opposition to multiple values per
> file in sysfs.

Which makes no sense for columnar data.
Pavel Machek April 26, 2024, 8:32 a.m. UTC | #18
Hi!

> > > > > The /proc/allocinfo file exposes a tremendous about of information about
> > > > > kernel build details, memory allocations (obviously), and potentially
> > > > > even image layout (due to ordering). As this is intended to be consumed
> > > > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > > > there: 0400.
> > > >
> > > > Err...
> > > >
> > > > The side effect of locking down more and more reporting interfaces is
> > > > that programs that consume those interfaces now have to run as root.
> > >
> > > sudo cat /proc/allocinfo | analyse-that-fie
> >
> > Even that is still an annoyance, but I'm thinking more about a future
> > daemon to collect this every n seconds - that really shouldn't need to
> > be root.
> 
> Yeah, that would preclude some nice usecases. Could we maybe use
> CAP_SYS_ADMIN checks instead? That way we can still use it from a
> non-root process?

CAP_SYS_ADMIN is really not suitable, as it can do changes to the
system. On working system, allocinfo is really not dangerous, it just
may make exploits harder. CAP_KERNEL_OBSERVER or something...

								Pavel
Pavel Machek April 26, 2024, 8:34 a.m. UTC | #19
On Fri 2024-04-26 04:25:40, Matthew Wilcox wrote:
> On Thu, Apr 25, 2024 at 08:58:34PM -0400, Kent Overstreet wrote:
> > On Thu, Apr 25, 2024 at 05:43:33PM -0700, Kees Cook wrote:
> > > All this said, I'm still not excited about any of these files living
> > > in /proc at all -- we were supposed to use /sys for this kind of thing,
> > > but its interface wasn't great for this kind of more "free-form" data,
> > > and debugfs isn't good for production interfaces. /proc really should
> > > only have pid information -- we end up exposing these top-level files to
> > > every mount namespace with a /proc mount. :( But that's a yet-to-be-solved
> > > problem...
> > 
> > It really wouldn't be that hard to relax the 4k file limit in sysfs.
> 
> It's a lot harder to relax the GregKH opposition to multiple values per
> file in sysfs.

With all the "vulnerability" files including multiple-files with
english text, you may be able to renegotiate that :-).

Joking, really the vulnerability files should be fixed.

									Pavel
Kent Overstreet April 26, 2024, 8:46 a.m. UTC | #20
On Fri, Apr 26, 2024 at 10:32:27AM +0200, Pavel Machek wrote:
> Hi!
> 
> > > > > > The /proc/allocinfo file exposes a tremendous about of information about
> > > > > > kernel build details, memory allocations (obviously), and potentially
> > > > > > even image layout (due to ordering). As this is intended to be consumed
> > > > > > by system owners (like /proc/slabinfo), use the same file permissions as
> > > > > > there: 0400.
> > > > >
> > > > > Err...
> > > > >
> > > > > The side effect of locking down more and more reporting interfaces is
> > > > > that programs that consume those interfaces now have to run as root.
> > > >
> > > > sudo cat /proc/allocinfo | analyse-that-fie
> > >
> > > Even that is still an annoyance, but I'm thinking more about a future
> > > daemon to collect this every n seconds - that really shouldn't need to
> > > be root.
> > 
> > Yeah, that would preclude some nice usecases. Could we maybe use
> > CAP_SYS_ADMIN checks instead? That way we can still use it from a
> > non-root process?
> 
> CAP_SYS_ADMIN is really not suitable, as it can do changes to the
> system. On working system, allocinfo is really not dangerous, it just
> may make exploits harder. CAP_KERNEL_OBSERVER or something...

There's _really_ no reason to use capabilities at all for something that
has file ownership - just use a group.
diff mbox series

Patch

diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index 26af9982ddc4..531dbe2f5456 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -129,7 +129,7 @@  size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sl
 
 static void __init procfs_init(void)
 {
-	proc_create_seq("allocinfo", 0444, NULL, &allocinfo_seq_op);
+	proc_create_seq("allocinfo", 0400, NULL, &allocinfo_seq_op);
 }
 
 static bool alloc_tag_module_unload(struct codetag_type *cttype,