mbox series

[RFC,userspace,0/6] Parallel setfiles/restorecon

Message ID 20210323170830.182553-1-omosnace@redhat.com (mailing list archive)
Headers show
Series Parallel setfiles/restorecon | expand

Message

Ondrej Mosnacek March 23, 2021, 5:08 p.m. UTC
This series adds basic support for parallel relabeling to the libselinux
API and the setfiles/restorecon CLI tools. It turns out that doing the
relabeling in parallel can significantly reduce the time even with a
relatively simple approach.

The first patch is a small cleanup that was found along the way and can
be applied independently. Patches 2-4 are small incremental changes that
make the internal selinux_restorecon functions more thread-safe (I kept
them separate for ease of of review, but maybe they should be rather
folded into the netx patch...). Patch 5 then completes the parallel
relabeling implementation at libselinux level and adds a new function
to the API that allows to make use of it. Finally, patch 6 adds parallel
relabeling support to he setfiles/restorecon tools.

The relevant man pages are also updated to reflect the new
functionality.

The patch descriptions contain more details, namely the last patch has
also some benchmark numbers.

Please test and review. I'm still not fully confident I got everything
right (esp. regarding error handling), but I wanted to put this forward
as an RFC to get some early feedback.

Ondrej Mosnacek (6):
  selinux_restorecon: simplify fl_head allocation by using calloc()
  selinux_restorecon: protect file_spec list with a mutex
  selinux_restorecon: introduce selinux_log_sync()
  selinux_restorecon: add a global mutex to synchronize progress output
  selinux_restorecon: introduce selinux_restorecon_parallel(3)
  setfiles/restorecon: support parallel relabeling

 libselinux/include/selinux/restorecon.h       |  14 +
 libselinux/man/man3/selinux_restorecon.3      |  29 +
 .../man/man3/selinux_restorecon_parallel.3    |   1 +
 libselinux/src/libselinux.map                 |   5 +
 libselinux/src/selinux_internal.h             |  14 +
 libselinux/src/selinux_restorecon.c           | 498 ++++++++++++------
 policycoreutils/setfiles/Makefile             |   2 +-
 policycoreutils/setfiles/restore.c            |   7 +-
 policycoreutils/setfiles/restore.h            |   2 +-
 policycoreutils/setfiles/restorecon.8         |   9 +
 policycoreutils/setfiles/setfiles.8           |   9 +
 policycoreutils/setfiles/setfiles.c           |  28 +-
 12 files changed, 436 insertions(+), 182 deletions(-)
 create mode 100644 libselinux/man/man3/selinux_restorecon_parallel.3

Comments

Peter Enderborg March 24, 2021, 9:58 a.m. UTC | #1
On 3/23/21 6:08 PM, Ondrej Mosnacek wrote:
> This series adds basic support for parallel relabeling to the libselinux
> API and the setfiles/restorecon CLI tools. It turns out that doing the
> relabeling in parallel can significantly reduce the time even with a
> relatively simple approach.
Nice! Have you any figures? Is it valid for both solid state and mechanical storage?
> The first patch is a small cleanup that was found along the way and can
> be applied independently. Patches 2-4 are small incremental changes that
> make the internal selinux_restorecon functions more thread-safe (I kept
> them separate for ease of of review, but maybe they should be rather
> folded into the netx patch...). Patch 5 then completes the parallel
> relabeling implementation at libselinux level and adds a new function
> to the API that allows to make use of it. Finally, patch 6 adds parallel
> relabeling support to he setfiles/restorecon tools.
>
> The relevant man pages are also updated to reflect the new
> functionality.
>
> The patch descriptions contain more details, namely the last patch has
> also some benchmark numbers.
>
> Please test and review. I'm still not fully confident I got everything
> right (esp. regarding error handling), but I wanted to put this forward
> as an RFC to get some early feedback.
>
> Ondrej Mosnacek (6):
>   selinux_restorecon: simplify fl_head allocation by using calloc()
>   selinux_restorecon: protect file_spec list with a mutex
>   selinux_restorecon: introduce selinux_log_sync()
>   selinux_restorecon: add a global mutex to synchronize progress output
>   selinux_restorecon: introduce selinux_restorecon_parallel(3)
>   setfiles/restorecon: support parallel relabeling
>
>  libselinux/include/selinux/restorecon.h       |  14 +
>  libselinux/man/man3/selinux_restorecon.3      |  29 +
>  .../man/man3/selinux_restorecon_parallel.3    |   1 +
>  libselinux/src/libselinux.map                 |   5 +
>  libselinux/src/selinux_internal.h             |  14 +
>  libselinux/src/selinux_restorecon.c           | 498 ++++++++++++------
>  policycoreutils/setfiles/Makefile             |   2 +-
>  policycoreutils/setfiles/restore.c            |   7 +-
>  policycoreutils/setfiles/restore.h            |   2 +-
>  policycoreutils/setfiles/restorecon.8         |   9 +
>  policycoreutils/setfiles/setfiles.8           |   9 +
>  policycoreutils/setfiles/setfiles.c           |  28 +-
>  12 files changed, 436 insertions(+), 182 deletions(-)
>  create mode 100644 libselinux/man/man3/selinux_restorecon_parallel.3
>
Ondrej Mosnacek March 24, 2021, 11:04 a.m. UTC | #2
On Wed, Mar 24, 2021 at 10:59 AM peter enderborg
<peter.enderborg@sony.com> wrote:
> On 3/23/21 6:08 PM, Ondrej Mosnacek wrote:
> > This series adds basic support for parallel relabeling to the libselinux
> > API and the setfiles/restorecon CLI tools. It turns out that doing the
> > relabeling in parallel can significantly reduce the time even with a
> > relatively simple approach.
> Nice! Have you any figures? Is it valid for both solid state and mechanical storage?

They are in the last patch :) The VM setup I measured that on probably
had the storage backed up by an SSD (or something with similar
characteristics). I haven't tried it on an HDD yet.

> > The first patch is a small cleanup that was found along the way and can
> > be applied independently. Patches 2-4 are small incremental changes that
> > make the internal selinux_restorecon functions more thread-safe (I kept
> > them separate for ease of of review, but maybe they should be rather
> > folded into the netx patch...). Patch 5 then completes the parallel
> > relabeling implementation at libselinux level and adds a new function
> > to the API that allows to make use of it. Finally, patch 6 adds parallel
> > relabeling support to he setfiles/restorecon tools.
> >
> > The relevant man pages are also updated to reflect the new
> > functionality.
> >
> > The patch descriptions contain more details, namely the last patch has
> > also some benchmark numbers.
> >
> > Please test and review. I'm still not fully confident I got everything
> > right (esp. regarding error handling), but I wanted to put this forward
> > as an RFC to get some early feedback.
> >
> > Ondrej Mosnacek (6):
> >   selinux_restorecon: simplify fl_head allocation by using calloc()
> >   selinux_restorecon: protect file_spec list with a mutex
> >   selinux_restorecon: introduce selinux_log_sync()
> >   selinux_restorecon: add a global mutex to synchronize progress output
> >   selinux_restorecon: introduce selinux_restorecon_parallel(3)
> >   setfiles/restorecon: support parallel relabeling
> >
> >  libselinux/include/selinux/restorecon.h       |  14 +
> >  libselinux/man/man3/selinux_restorecon.3      |  29 +
> >  .../man/man3/selinux_restorecon_parallel.3    |   1 +
> >  libselinux/src/libselinux.map                 |   5 +
> >  libselinux/src/selinux_internal.h             |  14 +
> >  libselinux/src/selinux_restorecon.c           | 498 ++++++++++++------
> >  policycoreutils/setfiles/Makefile             |   2 +-
> >  policycoreutils/setfiles/restore.c            |   7 +-
> >  policycoreutils/setfiles/restore.h            |   2 +-
> >  policycoreutils/setfiles/restorecon.8         |   9 +
> >  policycoreutils/setfiles/setfiles.8           |   9 +
> >  policycoreutils/setfiles/setfiles.c           |  28 +-
> >  12 files changed, 436 insertions(+), 182 deletions(-)
> >  create mode 100644 libselinux/man/man3/selinux_restorecon_parallel.3
> >
>


--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.
Nicolas Iooss April 28, 2021, 9:11 p.m. UTC | #3
On Wed, Mar 24, 2021 at 12:05 PM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>
> On Wed, Mar 24, 2021 at 10:59 AM peter enderborg
> <peter.enderborg@sony.com> wrote:
> > On 3/23/21 6:08 PM, Ondrej Mosnacek wrote:
> > > This series adds basic support for parallel relabeling to the libselinux
> > > API and the setfiles/restorecon CLI tools. It turns out that doing the
> > > relabeling in parallel can significantly reduce the time even with a
> > > relatively simple approach.
> > Nice! Have you any figures? Is it valid for both solid state and mechanical storage?
>
> They are in the last patch :) The VM setup I measured that on probably
> had the storage backed up by an SSD (or something with similar
> characteristics). I haven't tried it on an HDD yet.
>
> > > The first patch is a small cleanup that was found along the way and can
> > > be applied independently. Patches 2-4 are small incremental changes that
> > > make the internal selinux_restorecon functions more thread-safe (I kept
> > > them separate for ease of of review, but maybe they should be rather
> > > folded into the netx patch...). Patch 5 then completes the parallel
> > > relabeling implementation at libselinux level and adds a new function
> > > to the API that allows to make use of it. Finally, patch 6 adds parallel
> > > relabeling support to he setfiles/restorecon tools.
> > >
> > > The relevant man pages are also updated to reflect the new
> > > functionality.
> > >
> > > The patch descriptions contain more details, namely the last patch has
> > > also some benchmark numbers.
> > >
> > > Please test and review. I'm still not fully confident I got everything
> > > right (esp. regarding error handling), but I wanted to put this forward
> > > as an RFC to get some early feedback.
> > >
> > > Ondrej Mosnacek (6):
> > >   selinux_restorecon: simplify fl_head allocation by using calloc()
> > >   selinux_restorecon: protect file_spec list with a mutex
> > >   selinux_restorecon: introduce selinux_log_sync()
> > >   selinux_restorecon: add a global mutex to synchronize progress output
> > >   selinux_restorecon: introduce selinux_restorecon_parallel(3)
> > >   setfiles/restorecon: support parallel relabeling
> > >
> > >  libselinux/include/selinux/restorecon.h       |  14 +
> > >  libselinux/man/man3/selinux_restorecon.3      |  29 +
> > >  .../man/man3/selinux_restorecon_parallel.3    |   1 +
> > >  libselinux/src/libselinux.map                 |   5 +
> > >  libselinux/src/selinux_internal.h             |  14 +
> > >  libselinux/src/selinux_restorecon.c           | 498 ++++++++++++------
> > >  policycoreutils/setfiles/Makefile             |   2 +-
> > >  policycoreutils/setfiles/restore.c            |   7 +-
> > >  policycoreutils/setfiles/restore.h            |   2 +-
> > >  policycoreutils/setfiles/restorecon.8         |   9 +
> > >  policycoreutils/setfiles/setfiles.8           |   9 +
> > >  policycoreutils/setfiles/setfiles.c           |  28 +-
> > >  12 files changed, 436 insertions(+), 182 deletions(-)
> > >  create mode 100644 libselinux/man/man3/selinux_restorecon_parallel.3
> > >
> >

Hello,
I haven't seen any review of this RFC, so I decided to take a look.
The result looks quite good :) What is the current status of this
series, and can it become a "to-be-merged" patch series?

Anyway, here are some comments.

First, I was a little puzzled by the introduction of
selinux_log_sync() and the fact that it is used by
selinux_restorecon_common(), which means that callers of
selinux_restorecon() will also take the mutex log_mutex. This
surprised me because the description of one commit was very clear
about not depending very hard on libpthread... but in fact your code
is right there. I have just re-discovered the pthread helpers in
libselinux/src/selinux_internal.h :)

Nevertheless, now that I saw the pthread helpers, which enable using
libselinux without linking with libpthread, I am wondering: why is
introducing selinux_log_sync() like you do (and keeping calls to
selinux_log_sync() and selinux_log() in the code and praying that all
invocations from parallel code are converted to selinux_log_sync() )
better than introducing the mutex directly "in selinux_log()"? I
understand this is not so easy, because selinux_log is in fact a
function pointer... what I have in my mind consists in renaming the
pointer and in renaming a selinux_log_sync() to selinux_log(). This
would make the code less error-prone, regarding the issue of ensuring
to never call selinux_log callback in two parallel threads. What do
you think?

Then, when I compiled your patches with clang and some warning flags,
I got this warning:

selinux_restorecon.c:867:19: error: possible misuse of comma operator
here [-Werror,-Wcomma]
        while ((errno = 0, ftsent = fts_read(fts)) != NULL) {
                         ^
selinux_restorecon.c:867:10: note: cast expression to void to silence warning
        while ((errno = 0, ftsent = fts_read(fts)) != NULL) {
                ^~~~~~~~~
                (void)(  )
/usr/include/errno.h:38:16: note: expanded from macro 'errno'
# define errno (*__errno_location ())
               ^
1 error generated.

Using a comma operator seems to be right, here, and using the
suggested workaround to silence the compiler warning seems to be fine.

Finally, the generated file
libselinux/src/selinuxswig_python_exception.i needs to be upgraded,
because the new function selinux_restorecon_parallel is being added to
it. It would be nice to have a patch which updates this file, or to
have this update in patch "selinux_restorecon: introduce
selinux_restorecon_parallel(3)" (your choice).

Thanks!
Nicolas
Ondrej Mosnacek April 30, 2021, 12:49 p.m. UTC | #4
On Wed, Apr 28, 2021 at 11:12 PM Nicolas Iooss <nicolas.iooss@m4x.org> wrote:
> On Wed, Mar 24, 2021 at 12:05 PM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> >
> > On Wed, Mar 24, 2021 at 10:59 AM peter enderborg
> > <peter.enderborg@sony.com> wrote:
> > > On 3/23/21 6:08 PM, Ondrej Mosnacek wrote:
> > > > This series adds basic support for parallel relabeling to the libselinux
> > > > API and the setfiles/restorecon CLI tools. It turns out that doing the
> > > > relabeling in parallel can significantly reduce the time even with a
> > > > relatively simple approach.
> > > Nice! Have you any figures? Is it valid for both solid state and mechanical storage?
> >
> > They are in the last patch :) The VM setup I measured that on probably
> > had the storage backed up by an SSD (or something with similar
> > characteristics). I haven't tried it on an HDD yet.
> >
> > > > The first patch is a small cleanup that was found along the way and can
> > > > be applied independently. Patches 2-4 are small incremental changes that
> > > > make the internal selinux_restorecon functions more thread-safe (I kept
> > > > them separate for ease of of review, but maybe they should be rather
> > > > folded into the netx patch...). Patch 5 then completes the parallel
> > > > relabeling implementation at libselinux level and adds a new function
> > > > to the API that allows to make use of it. Finally, patch 6 adds parallel
> > > > relabeling support to he setfiles/restorecon tools.
> > > >
> > > > The relevant man pages are also updated to reflect the new
> > > > functionality.
> > > >
> > > > The patch descriptions contain more details, namely the last patch has
> > > > also some benchmark numbers.
> > > >
> > > > Please test and review. I'm still not fully confident I got everything
> > > > right (esp. regarding error handling), but I wanted to put this forward
> > > > as an RFC to get some early feedback.
> > > >
> > > > Ondrej Mosnacek (6):
> > > >   selinux_restorecon: simplify fl_head allocation by using calloc()
> > > >   selinux_restorecon: protect file_spec list with a mutex
> > > >   selinux_restorecon: introduce selinux_log_sync()
> > > >   selinux_restorecon: add a global mutex to synchronize progress output
> > > >   selinux_restorecon: introduce selinux_restorecon_parallel(3)
> > > >   setfiles/restorecon: support parallel relabeling
> > > >
> > > >  libselinux/include/selinux/restorecon.h       |  14 +
> > > >  libselinux/man/man3/selinux_restorecon.3      |  29 +
> > > >  .../man/man3/selinux_restorecon_parallel.3    |   1 +
> > > >  libselinux/src/libselinux.map                 |   5 +
> > > >  libselinux/src/selinux_internal.h             |  14 +
> > > >  libselinux/src/selinux_restorecon.c           | 498 ++++++++++++------
> > > >  policycoreutils/setfiles/Makefile             |   2 +-
> > > >  policycoreutils/setfiles/restore.c            |   7 +-
> > > >  policycoreutils/setfiles/restore.h            |   2 +-
> > > >  policycoreutils/setfiles/restorecon.8         |   9 +
> > > >  policycoreutils/setfiles/setfiles.8           |   9 +
> > > >  policycoreutils/setfiles/setfiles.c           |  28 +-
> > > >  12 files changed, 436 insertions(+), 182 deletions(-)
> > > >  create mode 100644 libselinux/man/man3/selinux_restorecon_parallel.3
> > > >
> > >
>
> Hello,
> I haven't seen any review of this RFC, so I decided to take a look.
> The result looks quite good :) What is the current status of this
> series, and can it become a "to-be-merged" patch series?

I hope it can :) I posted it as an RFC initially mainly to get some
high-level feedback whether I'm going in the right direction. But
based on your reply it seems it's already close enough to
acceptability that I could drop the RFC in the next revision. I also
didn't test it all that thoroughly, especially w.r.t. various
corner-cases...

>
> Anyway, here are some comments.
>
> First, I was a little puzzled by the introduction of
> selinux_log_sync() and the fact that it is used by
> selinux_restorecon_common(), which means that callers of
> selinux_restorecon() will also take the mutex log_mutex. This
> surprised me because the description of one commit was very clear
> about not depending very hard on libpthread... but in fact your code
> is right there. I have just re-discovered the pthread helpers in
> libselinux/src/selinux_internal.h :)
>
> Nevertheless, now that I saw the pthread helpers, which enable using
> libselinux without linking with libpthread, I am wondering: why is
> introducing selinux_log_sync() like you do (and keeping calls to
> selinux_log_sync() and selinux_log() in the code and praying that all
> invocations from parallel code are converted to selinux_log_sync() )
> better than introducing the mutex directly "in selinux_log()"? I
> understand this is not so easy, because selinux_log is in fact a
> function pointer... what I have in my mind consists in renaming the
> pointer and in renaming a selinux_log_sync() to selinux_log(). This
> would make the code less error-prone, regarding the issue of ensuring
> to never call selinux_log callback in two parallel threads. What do
> you think?

That's a good question. Since this will be the first function in
libselinux with some internal parallelism, I guess I just didn't want
to affect the status quo of existing code too much... Indeed the lock
would be taken also in the single-threaded implementation, but since
in selinux_restorecon() selinux_log() is called only in non-hot paths,
I didn't bother optimizing that.

Anyway, I agree that making selinux_log() synchronized globally may be
a good idea. The cost should be minimal and it would prevent
accidental race conditions. If some reasonable quorum of maintainers
agrees, I will make that change in v2.

> Then, when I compiled your patches with clang and some warning flags,
> I got this warning:
>
> selinux_restorecon.c:867:19: error: possible misuse of comma operator
> here [-Werror,-Wcomma]
>         while ((errno = 0, ftsent = fts_read(fts)) != NULL) {
>                          ^
> selinux_restorecon.c:867:10: note: cast expression to void to silence warning
>         while ((errno = 0, ftsent = fts_read(fts)) != NULL) {
>                 ^~~~~~~~~
>                 (void)(  )
> /usr/include/errno.h:38:16: note: expanded from macro 'errno'
> # define errno (*__errno_location ())
>                ^
> 1 error generated.
>
> Using a comma operator seems to be right, here, and using the
> suggested workaround to silence the compiler warning seems to be fine.

Hm, yes, I think I'll just add the cast to void there... Not a big fan
of the comma operator, but I couldn't resist the temptation to use it
here :)

> Finally, the generated file
> libselinux/src/selinuxswig_python_exception.i needs to be upgraded,
> because the new function selinux_restorecon_parallel is being added to
> it. It would be nice to have a patch which updates this file, or to
> have this update in patch "selinux_restorecon: introduce
> selinux_restorecon_parallel(3)" (your choice).

Ah, thanks for pointing that out. I'll address it in v2.

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.