Message ID | 20200327152107.95915-1-omosnace@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | userspace: Implement new format of filename trans rules | expand |
On 3/27/20 11:21 AM, Ondrej Mosnacek wrote: > These patches are the userspace side of the kernel change posted at [1]. > > The first patch changes libsepol's internal representation of filename > transition rules in a way similar to kernel commit c3a276111ea2 > ("selinux: optimize storage of filename transitions") [2]. > > The second patch then builds upon that and implements reading and > writing of a new binary policy format that uses this representation also > in the data layout. > > See individual patches for more details. > > NOTE: This series unfortunately breaks the build of setools. Moreover, > when an existing build of setools dynamically links against the new > libsepol, it segfaults. Sadly, there doesn't seem to be a nice way of > handling this, since setools relies on non-public libsepol policydb > API/ABI. I think this has happened before a few years ago when we made a different change to those structures, and required updates on the setools side. Maybe we need to figure out what setools needs to be encapsulated and exported as part of the libsepol public ABI/API, and then stop having it peer into libsepol internals?
On 3/27/20 3:21 PM, Stephen Smalley wrote: > On 3/27/20 11:21 AM, Ondrej Mosnacek wrote: >> These patches are the userspace side of the kernel change posted at [1]. >> >> The first patch changes libsepol's internal representation of filename >> transition rules in a way similar to kernel commit c3a276111ea2 >> ("selinux: optimize storage of filename transitions") [2]. >> >> The second patch then builds upon that and implements reading and >> writing of a new binary policy format that uses this representation also >> in the data layout. >> >> See individual patches for more details. >> >> NOTE: This series unfortunately breaks the build of setools. Moreover, >> when an existing build of setools dynamically links against the new >> libsepol, it segfaults. Sadly, there doesn't seem to be a nice way of >> handling this, since setools relies on non-public libsepol policydb >> API/ABI. > > I think this has happened before a few years ago when we made a > different change to those structures, and required updates on the > setools side. > > Maybe we need to figure out what setools needs to be encapsulated and > exported as part of the libsepol public ABI/API, and then stop having it > peer into libsepol internals? I'd be fine with that :) Ultimately SETools does 2 thing for the policy access: * iterate over the entire policy, reading data out of the various objects * use linkages between objects in the policy, e.g. get the type/attributes from an AV rule, get types from an attribute, etc. using the stuff like class_val_to_struct as needed. So if sufficient functionality to do dispol was exported, that would get close to all that was needed. There are some optimizations I made by digging at the structs a bit more than the above, but that could potentially be dropped if libsepol has sufficient support. See <https://github.com/SELinuxProject/setools/blob/master/setools/policyrep/selinuxpolicy.pxi#L673> for the policy loading code.
On Mon, Mar 30, 2020 at 9:06 AM Chris PeBenito <pebenito@ieee.org> wrote: > > On 3/27/20 3:21 PM, Stephen Smalley wrote: > > On 3/27/20 11:21 AM, Ondrej Mosnacek wrote: > >> These patches are the userspace side of the kernel change posted at [1]. > >> > >> The first patch changes libsepol's internal representation of filename > >> transition rules in a way similar to kernel commit c3a276111ea2 > >> ("selinux: optimize storage of filename transitions") [2]. > >> > >> The second patch then builds upon that and implements reading and > >> writing of a new binary policy format that uses this representation also > >> in the data layout. > >> > >> See individual patches for more details. > >> > >> NOTE: This series unfortunately breaks the build of setools. Moreover, > >> when an existing build of setools dynamically links against the new > >> libsepol, it segfaults. Sadly, there doesn't seem to be a nice way of > >> handling this, since setools relies on non-public libsepol policydb > >> API/ABI. > > > > I think this has happened before a few years ago when we made a > > different change to those structures, and required updates on the > > setools side. > > > > Maybe we need to figure out what setools needs to be encapsulated and > > exported as part of the libsepol public ABI/API, and then stop having it > > peer into libsepol internals? > > I'd be fine with that :) > I am a little confused. I would consider peering into the libsepol internals to be something like assuming the memory layout of structures and pulling things out based on those assumptions, but setools is just using public header files and functions. It seems to me to be using the public API. Now I think that too much of the internals are being exposed, but I am not sure we can do anything about that now. It doesn't seem like it would be too hard to update setools to work with this change, unless there is a requirement that it still work with older versions of libsepol. I am not very familiar with cython and I don't know how to make it work differently depending on what version of libsepol is on the system. Speaking of versions, since we are actually modifying structs in the header files instead of just adding new things, don't we need change the version of libsepol or something? > Ultimately SETools does 2 thing for the policy access: > * iterate over the entire policy, reading data out of the various objects > * use linkages between objects in the policy, e.g. get the > type/attributes from an AV rule, get types from an attribute, etc. using > the stuff like class_val_to_struct as needed. > > So if sufficient functionality to do dispol was exported, that would get > close to all that was needed. There are some optimizations I made by > digging at the structs a bit more than the above, but that could > potentially be dropped if libsepol has sufficient support. See > <https://github.com/SELinuxProject/setools/blob/master/setools/policyrep/selinuxpolicy.pxi#L673> > for the policy loading code. > I think the fact that the CIL, kernel to CIL, kernel to conf, and module to CIL code is all in libsepol speaks to the fact of how tightly integrated they are to the rest of libsepol. One argument that could be made is that the policyrep stuff in setools really belongs in libsepol. Thinking about how libsepol could be encapsulated leads me to a couple of possibilities. One way would be functions that could return lists of rules. The policy module code gives us avrule_t, role_trans_rule_t, role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. Those structures are probably unlikely to change and, at least in this case, creating a function that walks the filename_trans hashtable and returns a list of filename_trans_rule_t certainly seems like it wouldn't be too hard. Another possible way to encapsulate would be create a way to walk the various hashtables element by element (I think hashtab_map() requires too much knowledge of the internal structures), returning an opaque structure to track where you are in the hashtable and functions that allow you to get each part of the rule being stored. There are other ways that it could be done, but I could rewrite kernel to and module to stuff with either of those. CIL itself would require some functions to insert rules into the policydb which probably wouldn't be too hard. None of this would be too hard, but it would take some time. The real question is would it really be valuable? Jim
On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > > On Mon, Mar 30, 2020 at 9:06 AM Chris PeBenito <pebenito@ieee.org> wrote: > > > > On 3/27/20 3:21 PM, Stephen Smalley wrote: > > > On 3/27/20 11:21 AM, Ondrej Mosnacek wrote: > > >> These patches are the userspace side of the kernel change posted at [1]. > > >> > > >> The first patch changes libsepol's internal representation of filename > > >> transition rules in a way similar to kernel commit c3a276111ea2 > > >> ("selinux: optimize storage of filename transitions") [2]. > > >> > > >> The second patch then builds upon that and implements reading and > > >> writing of a new binary policy format that uses this representation also > > >> in the data layout. > > >> > > >> See individual patches for more details. > > >> > > >> NOTE: This series unfortunately breaks the build of setools. Moreover, > > >> when an existing build of setools dynamically links against the new > > >> libsepol, it segfaults. Sadly, there doesn't seem to be a nice way of > > >> handling this, since setools relies on non-public libsepol policydb > > >> API/ABI. > > > > > > I think this has happened before a few years ago when we made a > > > different change to those structures, and required updates on the > > > setools side. > > > > > > Maybe we need to figure out what setools needs to be encapsulated and > > > exported as part of the libsepol public ABI/API, and then stop having it > > > peer into libsepol internals? > > > > I'd be fine with that :) > > > > I am a little confused. I would consider peering into the libsepol > internals to be something like assuming the memory layout of > structures and pulling things out based on those assumptions, but > setools is just using public header files and functions. It seems to > me to be using the public API. Now I think that too much of the > internals are being exposed, but I am not sure we can do anything > about that now. The sepol/policydb/*.h files are considered private and are only for static users of libsepol like checkpolicy. Only the top-level sepol/*.h headers are part of the shared library API/ABI. > It doesn't seem like it would be too hard to update setools to work > with this change, unless there is a requirement that it still work > with older versions of libsepol. I am not very familiar with cython > and I don't know how to make it work differently depending on what > version of libsepol is on the system. > > Speaking of versions, since we are actually modifying structs in the > header files instead of just adding new things, don't we need change > the version of libsepol or something? > > > Ultimately SETools does 2 thing for the policy access: > > * iterate over the entire policy, reading data out of the various objects > > * use linkages between objects in the policy, e.g. get the > > type/attributes from an AV rule, get types from an attribute, etc. using > > the stuff like class_val_to_struct as needed. > > > > So if sufficient functionality to do dispol was exported, that would get > > close to all that was needed. There are some optimizations I made by > > digging at the structs a bit more than the above, but that could > > potentially be dropped if libsepol has sufficient support. See > > <https://github.com/SELinuxProject/setools/blob/master/setools/policyrep/selinuxpolicy.pxi#L673> > > for the policy loading code. > > > > I think the fact that the CIL, kernel to CIL, kernel to conf, and > module to CIL code is all in libsepol speaks to the fact of how > tightly integrated they are to the rest of libsepol. One argument that > could be made is that the policyrep stuff in setools really belongs in > libsepol. > > Thinking about how libsepol could be encapsulated leads me to a couple > of possibilities. One way would be functions that could return lists > of rules. The policy module code gives us avrule_t, role_trans_rule_t, > role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > Those structures are probably unlikely to change and, at least in this > case, creating a function that walks the filename_trans hashtable and > returns a list of filename_trans_rule_t certainly seems like it > wouldn't be too hard. Another possible way to encapsulate would be > create a way to walk the various hashtables element by element (I > think hashtab_map() requires too much knowledge of the internal > structures), returning an opaque structure to track where you are in > the hashtable and functions that allow you to get each part of the > rule being stored. There are other ways that it could be done, but I > could rewrite kernel to and module to stuff with either of those. CIL > itself would require some functions to insert rules into the policydb > which probably wouldn't be too hard. None of this would be too hard, > but it would take some time. The real question is would it really be > valuable?
On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > I think the fact that the CIL, kernel to CIL, kernel to conf, and > module to CIL code is all in libsepol speaks to the fact of how > tightly integrated they are to the rest of libsepol. One argument that > could be made is that the policyrep stuff in setools really belongs in > libsepol. > > Thinking about how libsepol could be encapsulated leads me to a couple > of possibilities. One way would be functions that could return lists > of rules. The policy module code gives us avrule_t, role_trans_rule_t, > role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > Those structures are probably unlikely to change and, at least in this > case, creating a function that walks the filename_trans hashtable and > returns a list of filename_trans_rule_t certainly seems like it > wouldn't be too hard. Another possible way to encapsulate would be > create a way to walk the various hashtables element by element (I > think hashtab_map() requires too much knowledge of the internal > structures), returning an opaque structure to track where you are in > the hashtable and functions that allow you to get each part of the > rule being stored. There are other ways that it could be done, but I > could rewrite kernel to and module to stuff with either of those. CIL > itself would require some functions to insert rules into the policydb > which probably wouldn't be too hard. None of this would be too hard, > but it would take some time. The real question is would it really be > valuable? I don't think we want to directly expose the existing data structures from include/sepol/policydb/*.h (or at least not without a careful audit) since those are often tightly coupled to policy compiler internals and/or the kernel or module policy formats. Creating an abstraction for each with a proper API in new definitions in include/sepol/*.h would be preferable albeit more work. There was a proposal a long time ago from the setools developers to create an iterator interface and accessor functions for each data type, see https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/.
On Thu, Apr 30, 2020 at 3:23 PM Stephen Smalley <stephen.smalley.work@gmail.com> wrote: > On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > > I think the fact that the CIL, kernel to CIL, kernel to conf, and > > module to CIL code is all in libsepol speaks to the fact of how > > tightly integrated they are to the rest of libsepol. One argument that > > could be made is that the policyrep stuff in setools really belongs in > > libsepol. > > > > Thinking about how libsepol could be encapsulated leads me to a couple > > of possibilities. One way would be functions that could return lists > > of rules. The policy module code gives us avrule_t, role_trans_rule_t, > > role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > > Those structures are probably unlikely to change and, at least in this > > case, creating a function that walks the filename_trans hashtable and > > returns a list of filename_trans_rule_t certainly seems like it > > wouldn't be too hard. Another possible way to encapsulate would be > > create a way to walk the various hashtables element by element (I > > think hashtab_map() requires too much knowledge of the internal > > structures), returning an opaque structure to track where you are in > > the hashtable and functions that allow you to get each part of the > > rule being stored. There are other ways that it could be done, but I > > could rewrite kernel to and module to stuff with either of those. CIL > > itself would require some functions to insert rules into the policydb > > which probably wouldn't be too hard. None of this would be too hard, > > but it would take some time. The real question is would it really be > > valuable? > > I don't think we want to directly expose the existing data structures > from include/sepol/policydb/*.h (or at least not without a careful > audit) since those are often tightly coupled to policy compiler > internals and/or the kernel or module policy formats. Creating an > abstraction for each with a proper API in new definitions in > include/sepol/*.h would be preferable albeit more work. There was a > proposal a long time ago from the setools developers to create an > iterator interface and accessor functions for each data type, see > https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. We could also partially mitigate the problem by linking libsepol statically into setools at build time. I suggested this in [1] for Fedora at least, but in general there is the problem that you need to know the exact path to libsepol.a in setup.py [2]. I don't have enough experience with python libraries to know if this can be implemented directly in the upstream setup.py script in a reasonably generic way. If linked statically, at least it wouldn't segfault after an update to an incompatible libsepol.so. It would still fail to build with the new headers, but at least this turns from a user-visible bug to only packager's & maintainer's problem. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1826000 [2] https://stackoverflow.com/questions/4597228/how-to-statically-link-a-library-when-compiling-a-python-module-extension/49139257#49139257
On 4/30/20 9:22 AM, Stephen Smalley wrote: > On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: >> I think the fact that the CIL, kernel to CIL, kernel to conf, and >> module to CIL code is all in libsepol speaks to the fact of how >> tightly integrated they are to the rest of libsepol. One argument that >> could be made is that the policyrep stuff in setools really belongs in >> libsepol. >> >> Thinking about how libsepol could be encapsulated leads me to a couple >> of possibilities. One way would be functions that could return lists >> of rules. The policy module code gives us avrule_t, role_trans_rule_t, >> role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. >> Those structures are probably unlikely to change and, at least in this >> case, creating a function that walks the filename_trans hashtable and >> returns a list of filename_trans_rule_t certainly seems like it >> wouldn't be too hard. Another possible way to encapsulate would be >> create a way to walk the various hashtables element by element (I >> think hashtab_map() requires too much knowledge of the internal >> structures), returning an opaque structure to track where you are in >> the hashtable and functions that allow you to get each part of the >> rule being stored. There are other ways that it could be done, but I >> could rewrite kernel to and module to stuff with either of those. CIL >> itself would require some functions to insert rules into the policydb >> which probably wouldn't be too hard. None of this would be too hard, >> but it would take some time. The real question is would it really be >> valuable? > > I don't think we want to directly expose the existing data structures > from include/sepol/policydb/*.h (or at least not without a careful > audit) since those are often tightly coupled to policy compiler > internals and/or the kernel or module policy formats. Creating an > abstraction for each with a proper API in new definitions in > include/sepol/*.h would be preferable albeit more work. There was a > proposal a long time ago from the setools developers to create an > iterator interface and accessor functions for each data type, see > https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. I agree. The hardest thing with writing the policyrep in setools was stuff like the value_to_datum indirections, type_attr_map, etc. and knowing when to use value vs value-1. An API that has a new set of structs would be ideal. Unfortunately, since setools policyrep is now written in Cython, we can't simply move the code over to libsepol. My guess is dispol has the most useful building blocks for making a new API.
On Thu, Apr 30, 2020 at 4:24 PM Chris PeBenito <pebenito@ieee.org> wrote: > On 4/30/20 9:22 AM, Stephen Smalley wrote: > > On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > >> I think the fact that the CIL, kernel to CIL, kernel to conf, and > >> module to CIL code is all in libsepol speaks to the fact of how > >> tightly integrated they are to the rest of libsepol. One argument that > >> could be made is that the policyrep stuff in setools really belongs in > >> libsepol. > >> > >> Thinking about how libsepol could be encapsulated leads me to a couple > >> of possibilities. One way would be functions that could return lists > >> of rules. The policy module code gives us avrule_t, role_trans_rule_t, > >> role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > >> Those structures are probably unlikely to change and, at least in this > >> case, creating a function that walks the filename_trans hashtable and > >> returns a list of filename_trans_rule_t certainly seems like it > >> wouldn't be too hard. Another possible way to encapsulate would be > >> create a way to walk the various hashtables element by element (I > >> think hashtab_map() requires too much knowledge of the internal > >> structures), returning an opaque structure to track where you are in > >> the hashtable and functions that allow you to get each part of the > >> rule being stored. There are other ways that it could be done, but I > >> could rewrite kernel to and module to stuff with either of those. CIL > >> itself would require some functions to insert rules into the policydb > >> which probably wouldn't be too hard. None of this would be too hard, > >> but it would take some time. The real question is would it really be > >> valuable? > > > > I don't think we want to directly expose the existing data structures > > from include/sepol/policydb/*.h (or at least not without a careful > > audit) since those are often tightly coupled to policy compiler > > internals and/or the kernel or module policy formats. Creating an > > abstraction for each with a proper API in new definitions in > > include/sepol/*.h would be preferable albeit more work. There was a > > proposal a long time ago from the setools developers to create an > > iterator interface and accessor functions for each data type, see > > https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. > > I agree. The hardest thing with writing the policyrep in setools was stuff like > the value_to_datum indirections, type_attr_map, etc. and knowing when to use > value vs value-1. An API that has a new set of structs would be ideal. > > Unfortunately, since setools policyrep is now written in Cython, we can't simply > move the code over to libsepol. My guess is dispol has the most useful building > blocks for making a new API. Since you mention dispol... I also had the idea that setools could just use the existing public interface to convert the whole policydb to CIL and simply parse that as a string (this should be pretty straightforward even in pure Python). However, based on my experiments this would likely make setools a lot slower...
On 4/30/20 10:20 AM, Ondrej Mosnacek wrote: > On Thu, Apr 30, 2020 at 3:23 PM Stephen Smalley > <stephen.smalley.work@gmail.com> wrote: >> On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: >>> I think the fact that the CIL, kernel to CIL, kernel to conf, and >>> module to CIL code is all in libsepol speaks to the fact of how >>> tightly integrated they are to the rest of libsepol. One argument that >>> could be made is that the policyrep stuff in setools really belongs in >>> libsepol. >>> >>> Thinking about how libsepol could be encapsulated leads me to a couple >>> of possibilities. One way would be functions that could return lists >>> of rules. The policy module code gives us avrule_t, role_trans_rule_t, >>> role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. >>> Those structures are probably unlikely to change and, at least in this >>> case, creating a function that walks the filename_trans hashtable and >>> returns a list of filename_trans_rule_t certainly seems like it >>> wouldn't be too hard. Another possible way to encapsulate would be >>> create a way to walk the various hashtables element by element (I >>> think hashtab_map() requires too much knowledge of the internal >>> structures), returning an opaque structure to track where you are in >>> the hashtable and functions that allow you to get each part of the >>> rule being stored. There are other ways that it could be done, but I >>> could rewrite kernel to and module to stuff with either of those. CIL >>> itself would require some functions to insert rules into the policydb >>> which probably wouldn't be too hard. None of this would be too hard, >>> but it would take some time. The real question is would it really be >>> valuable? >> >> I don't think we want to directly expose the existing data structures >> from include/sepol/policydb/*.h (or at least not without a careful >> audit) since those are often tightly coupled to policy compiler >> internals and/or the kernel or module policy formats. Creating an >> abstraction for each with a proper API in new definitions in >> include/sepol/*.h would be preferable albeit more work. There was a >> proposal a long time ago from the setools developers to create an >> iterator interface and accessor functions for each data type, see >> https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. > > We could also partially mitigate the problem by linking libsepol > statically into setools at build time. I suggested this in [1] for > Fedora at least, but in general there is the problem that you need to > know the exact path to libsepol.a in setup.py [2]. I don't have enough > experience with python libraries to know if this can be implemented > directly in the upstream setup.py script in a reasonably generic way. > > If linked statically, at least it wouldn't segfault after an update to > an incompatible libsepol.so. It would still fail to build with the new > headers, but at least this turns from a user-visible bug to only > packager's & maintainer's problem. It used to be static and people complained about that. I moved it to dynamic knowing this would be an issue, but an uncommon one.
On 4/30/20 10:34 AM, Ondrej Mosnacek wrote: > On Thu, Apr 30, 2020 at 4:24 PM Chris PeBenito <pebenito@ieee.org> wrote: >> On 4/30/20 9:22 AM, Stephen Smalley wrote: >>> On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: >>>> I think the fact that the CIL, kernel to CIL, kernel to conf, and >>>> module to CIL code is all in libsepol speaks to the fact of how >>>> tightly integrated they are to the rest of libsepol. One argument that >>>> could be made is that the policyrep stuff in setools really belongs in >>>> libsepol. >>>> >>>> Thinking about how libsepol could be encapsulated leads me to a couple >>>> of possibilities. One way would be functions that could return lists >>>> of rules. The policy module code gives us avrule_t, role_trans_rule_t, >>>> role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. >>>> Those structures are probably unlikely to change and, at least in this >>>> case, creating a function that walks the filename_trans hashtable and >>>> returns a list of filename_trans_rule_t certainly seems like it >>>> wouldn't be too hard. Another possible way to encapsulate would be >>>> create a way to walk the various hashtables element by element (I >>>> think hashtab_map() requires too much knowledge of the internal >>>> structures), returning an opaque structure to track where you are in >>>> the hashtable and functions that allow you to get each part of the >>>> rule being stored. There are other ways that it could be done, but I >>>> could rewrite kernel to and module to stuff with either of those. CIL >>>> itself would require some functions to insert rules into the policydb >>>> which probably wouldn't be too hard. None of this would be too hard, >>>> but it would take some time. The real question is would it really be >>>> valuable? >>> >>> I don't think we want to directly expose the existing data structures >>> from include/sepol/policydb/*.h (or at least not without a careful >>> audit) since those are often tightly coupled to policy compiler >>> internals and/or the kernel or module policy formats. Creating an >>> abstraction for each with a proper API in new definitions in >>> include/sepol/*.h would be preferable albeit more work. There was a >>> proposal a long time ago from the setools developers to create an >>> iterator interface and accessor functions for each data type, see >>> https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. >> >> I agree. The hardest thing with writing the policyrep in setools was stuff like >> the value_to_datum indirections, type_attr_map, etc. and knowing when to use >> value vs value-1. An API that has a new set of structs would be ideal. >> >> Unfortunately, since setools policyrep is now written in Cython, we can't simply >> move the code over to libsepol. My guess is dispol has the most useful building >> blocks for making a new API. > > Since you mention dispol... I also had the idea that setools could > just use the existing public interface to convert the whole policydb > to CIL and simply parse that as a string (this should be pretty > straightforward even in pure Python). However, based on my experiments > this would likely make setools a lot slower... This is an interesting idea. I'm not familiar with the CIL API; secilc.c looks like it uses public API. Can I use the existing CIL library functions to parse it, or does the resultant db lack public accessor functions?
On Thu, Apr 30, 2020 at 9:23 AM Stephen Smalley <stephen.smalley.work@gmail.com> wrote: > > On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > > I think the fact that the CIL, kernel to CIL, kernel to conf, and > > module to CIL code is all in libsepol speaks to the fact of how > > tightly integrated they are to the rest of libsepol. One argument that > > could be made is that the policyrep stuff in setools really belongs in > > libsepol. > > > > Thinking about how libsepol could be encapsulated leads me to a couple > > of possibilities. One way would be functions that could return lists > > of rules. The policy module code gives us avrule_t, role_trans_rule_t, > > role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > > Those structures are probably unlikely to change and, at least in this > > case, creating a function that walks the filename_trans hashtable and > > returns a list of filename_trans_rule_t certainly seems like it > > wouldn't be too hard. Another possible way to encapsulate would be > > create a way to walk the various hashtables element by element (I > > think hashtab_map() requires too much knowledge of the internal > > structures), returning an opaque structure to track where you are in > > the hashtable and functions that allow you to get each part of the > > rule being stored. There are other ways that it could be done, but I > > could rewrite kernel to and module to stuff with either of those. CIL > > itself would require some functions to insert rules into the policydb > > which probably wouldn't be too hard. None of this would be too hard, > > but it would take some time. The real question is would it really be > > valuable? > > I don't think we want to directly expose the existing data structures > from include/sepol/policydb/*.h (or at least not without a careful > audit) since those are often tightly coupled to policy compiler > internals and/or the kernel or module policy formats. Creating an > abstraction for each with a proper API in new definitions in > include/sepol/*.h would be preferable albeit more work. There was a > proposal a long time ago from the setools developers to create an > iterator interface and accessor functions for each data type, see > https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. I am so used to everything just using the stuff in include/sepol/policydb/ that I forget that it exists. There are a number of iterators there (for users, ports, nodes, etc), so it seems like creating an iterator and other functions for filename transition rules and converting setools to use that would be the first step. Over time other iterators can be created and setools converted to use those as well. Jim
On Thu, Apr 30, 2020 at 11:20 AM Chris PeBenito <pebenito@ieee.org> wrote: > > On 4/30/20 10:34 AM, Ondrej Mosnacek wrote: > > On Thu, Apr 30, 2020 at 4:24 PM Chris PeBenito <pebenito@ieee.org> wrote: > >> On 4/30/20 9:22 AM, Stephen Smalley wrote: > >>> On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > >>>> I think the fact that the CIL, kernel to CIL, kernel to conf, and > >>>> module to CIL code is all in libsepol speaks to the fact of how > >>>> tightly integrated they are to the rest of libsepol. One argument that > >>>> could be made is that the policyrep stuff in setools really belongs in > >>>> libsepol. > >>>> > >>>> Thinking about how libsepol could be encapsulated leads me to a couple > >>>> of possibilities. One way would be functions that could return lists > >>>> of rules. The policy module code gives us avrule_t, role_trans_rule_t, > >>>> role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > >>>> Those structures are probably unlikely to change and, at least in this > >>>> case, creating a function that walks the filename_trans hashtable and > >>>> returns a list of filename_trans_rule_t certainly seems like it > >>>> wouldn't be too hard. Another possible way to encapsulate would be > >>>> create a way to walk the various hashtables element by element (I > >>>> think hashtab_map() requires too much knowledge of the internal > >>>> structures), returning an opaque structure to track where you are in > >>>> the hashtable and functions that allow you to get each part of the > >>>> rule being stored. There are other ways that it could be done, but I > >>>> could rewrite kernel to and module to stuff with either of those. CIL > >>>> itself would require some functions to insert rules into the policydb > >>>> which probably wouldn't be too hard. None of this would be too hard, > >>>> but it would take some time. The real question is would it really be > >>>> valuable? > >>> > >>> I don't think we want to directly expose the existing data structures > >>> from include/sepol/policydb/*.h (or at least not without a careful > >>> audit) since those are often tightly coupled to policy compiler > >>> internals and/or the kernel or module policy formats. Creating an > >>> abstraction for each with a proper API in new definitions in > >>> include/sepol/*.h would be preferable albeit more work. There was a > >>> proposal a long time ago from the setools developers to create an > >>> iterator interface and accessor functions for each data type, see > >>> https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. > >> > >> I agree. The hardest thing with writing the policyrep in setools was stuff like > >> the value_to_datum indirections, type_attr_map, etc. and knowing when to use > >> value vs value-1. An API that has a new set of structs would be ideal. > >> > >> Unfortunately, since setools policyrep is now written in Cython, we can't simply > >> move the code over to libsepol. My guess is dispol has the most useful building > >> blocks for making a new API. > > > > Since you mention dispol... I also had the idea that setools could > > just use the existing public interface to convert the whole policydb > > to CIL and simply parse that as a string (this should be pretty > > straightforward even in pure Python). However, based on my experiments > > this would likely make setools a lot slower... > > This is an interesting idea. I'm not familiar with the CIL API; secilc.c looks > like it uses public API. Can I use the existing CIL library functions to parse > it, or does the resultant db lack public accessor functions? > The resultant db does, in fact, lack public assessor functions. They could be created, but since there is already a way to convert the cil_db to a policydb, the general solution would be to create the functions for the policydb. Jim
On Thu, Apr 30, 2020 at 5:20 PM Chris PeBenito <pebenito@ieee.org> wrote: > On 4/30/20 10:34 AM, Ondrej Mosnacek wrote: > > On Thu, Apr 30, 2020 at 4:24 PM Chris PeBenito <pebenito@ieee.org> wrote: > >> On 4/30/20 9:22 AM, Stephen Smalley wrote: > >>> On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@gmail.com> wrote: > >>>> I think the fact that the CIL, kernel to CIL, kernel to conf, and > >>>> module to CIL code is all in libsepol speaks to the fact of how > >>>> tightly integrated they are to the rest of libsepol. One argument that > >>>> could be made is that the policyrep stuff in setools really belongs in > >>>> libsepol. > >>>> > >>>> Thinking about how libsepol could be encapsulated leads me to a couple > >>>> of possibilities. One way would be functions that could return lists > >>>> of rules. The policy module code gives us avrule_t, role_trans_rule_t, > >>>> role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > >>>> Those structures are probably unlikely to change and, at least in this > >>>> case, creating a function that walks the filename_trans hashtable and > >>>> returns a list of filename_trans_rule_t certainly seems like it > >>>> wouldn't be too hard. Another possible way to encapsulate would be > >>>> create a way to walk the various hashtables element by element (I > >>>> think hashtab_map() requires too much knowledge of the internal > >>>> structures), returning an opaque structure to track where you are in > >>>> the hashtable and functions that allow you to get each part of the > >>>> rule being stored. There are other ways that it could be done, but I > >>>> could rewrite kernel to and module to stuff with either of those. CIL > >>>> itself would require some functions to insert rules into the policydb > >>>> which probably wouldn't be too hard. None of this would be too hard, > >>>> but it would take some time. The real question is would it really be > >>>> valuable? > >>> > >>> I don't think we want to directly expose the existing data structures > >>> from include/sepol/policydb/*.h (or at least not without a careful > >>> audit) since those are often tightly coupled to policy compiler > >>> internals and/or the kernel or module policy formats. Creating an > >>> abstraction for each with a proper API in new definitions in > >>> include/sepol/*.h would be preferable albeit more work. There was a > >>> proposal a long time ago from the setools developers to create an > >>> iterator interface and accessor functions for each data type, see > >>> https://lore.kernel.org/selinux/200603212246.k2LMkRNq028071@gotham.columbia.tresys.com/. > >> > >> I agree. The hardest thing with writing the policyrep in setools was stuff like > >> the value_to_datum indirections, type_attr_map, etc. and knowing when to use > >> value vs value-1. An API that has a new set of structs would be ideal. > >> > >> Unfortunately, since setools policyrep is now written in Cython, we can't simply > >> move the code over to libsepol. My guess is dispol has the most useful building > >> blocks for making a new API. > > > > Since you mention dispol... I also had the idea that setools could > > just use the existing public interface to convert the whole policydb > > to CIL and simply parse that as a string (this should be pretty > > straightforward even in pure Python). However, based on my experiments > > this would likely make setools a lot slower... > > This is an interesting idea. I'm not familiar with the CIL API; secilc.c looks > like it uses public API. Can I use the existing CIL library functions to parse > it, or does the resultant db lack public accessor functions? What I had in mind was actually to just use sepol_kernel_policydb_to_cil() to dump the textual CIL into a temporary file (maybe using tmpfile(3)), and then parse the contents in Python. The CIL format is really easy to parse (especially in Python) so a lack of existing functions for that wouldn't be much of an issue. Yes, this would be a little dirty, but you'd avoid the trouble of maintaining a stable binary interface between libsepol and setools.