Message ID | 9b77124b-675d-5ac7-3741-edec575bd425@linux.intel.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Series | Introduce CAP_PERFMON to secure system performance monitoring and observability |
On 1/20/20 6:23 AM, Alexey Budankov wrote:
>
> Introduce CAP_PERFMON capability designed to secure system performance
> monitoring and observability operations so that CAP_PERFMON would assist
> CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
> and other performance monitoring and observability subsystems.
>
> CAP_PERFMON intends to harden system security and integrity during system
> performance monitoring and observability operations by decreasing attack
> surface that is available to a CAP_SYS_ADMIN privileged process [1].
> Providing access to system performance monitoring and observability
> operations under CAP_PERFMON capability singly, without the rest of
> CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials and
> makes operation more secure.
>
> CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
> system performance monitoring and observability operations and balance
> amount of CAP_SYS_ADMIN credentials following the recommendations in the
> capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
> overloaded; see Notes to kernel developers, below."
>
> Although the software running under CAP_PERFMON can not ensure avoidance
> of related hardware issues, the software can still mitigate these issues
> following the official embargoed hardware issues mitigation procedure [2].
> The bugs in the software itself could be fixed following the standard
> kernel development process [3] to maintain and harden security of system
> performance monitoring and observability operations.
>
> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
> [2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
> [3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
>
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
>  include/linux/capability.h          | 12 ++++++++++++
>  include/uapi/linux/capability.h     |  8 +++++++-
>  security/selinux/include/classmap.h |  4 ++--
>  3 files changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index ecce0f43c73a..8784969d91e1 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -251,6 +251,18 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
>  extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
>  extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
>  extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
> +static inline bool perfmon_capable(void)
> +{
> +	struct user_namespace *ns = &init_user_ns;
> +
> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
> +		return ns_capable(ns, CAP_PERFMON);
> +
> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
> +		return ns_capable(ns, CAP_SYS_ADMIN);
> +
> +	return false;
> +}

Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
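To make the question above concrete, here is a minimal userspace sketch in C. It is not kernel code. It assumes a deliberately simplified model in which an audited check leaves a record only when it denies access; real audit output also depends on LSM policy and audit configuration. The names `model_task`, `model_capable()` and `model_capable_noaudit()`, and the capability numbering, are hypothetical stand-ins for the kernel's `ns_capable()` and `ns_capable_noaudit()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability numbers, used only inside this model. */
enum model_cap { MODEL_CAP_SYS_ADMIN, MODEL_CAP_PERFMON, MODEL_CAP_COUNT };

struct model_task {
	unsigned int effective;       /* bitmask of capabilities the task holds */
	unsigned int audited_denials; /* denials that left an audit record */
};

/* Stand-in for ns_capable_noaudit(): a denial leaves no audit record. */
static bool model_capable_noaudit(const struct model_task *t, int cap)
{
	assert(cap >= 0 && cap < MODEL_CAP_COUNT);
	return (t->effective >> cap) & 1u;
}

/* Stand-in for ns_capable(): same check, but a denial is recorded. */
static bool model_capable(struct model_task *t, int cap)
{
	bool ok = model_capable_noaudit(t, cap);

	if (!ok)
		t->audited_denials++;
	return ok;
}

/* Mirrors the control flow of the v5 perfmon_capable() quoted above. */
static bool perfmon_capable_v5(struct model_task *t)
{
	if (model_capable_noaudit(t, MODEL_CAP_PERFMON))
		return model_capable(t, MODEL_CAP_PERFMON);

	if (model_capable_noaudit(t, MODEL_CAP_SYS_ADMIN))
		return model_capable(t, MODEL_CAP_SYS_ADMIN);

	return false; /* denied, but only the _noaudit checks ever failed */
}
```

In this model, a task with an empty `effective` mask is denied by `perfmon_capable_v5()` while `audited_denials` stays at zero, which is the silent failure the question points at.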
On 21.01.2020 17:43, Stephen Smalley wrote:
> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>> <SNIP>
>> +static inline bool perfmon_capable(void)
>> +{
>> +	struct user_namespace *ns = &init_user_ns;
>> +
>> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
>> +		return ns_capable(ns, CAP_PERFMON);
>> +
>> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
>> +		return ns_capable(ns, CAP_SYS_ADMIN);
>> +
>> +	return false;
>> +}
>
> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.

Some of the ideas are from the v4 review.
Well, on second thought, it definitely should be logged for CAP_SYS_ADMIN.
Probably it is not so fatal for CAP_PERFMON, but personally
I would unconditionally log it for CAP_PERFMON as well.

Good catch, thank you.

~Alexey
On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov <alexey.budankov@linux.intel.com> wrote:
>
>
> On 21.01.2020 17:43, Stephen Smalley wrote:
> > On 1/20/20 6:23 AM, Alexey Budankov wrote:
> >> <SNIP>
> >> +static inline bool perfmon_capable(void)
> >> +{
> >> +	struct user_namespace *ns = &init_user_ns;
> >> +
> >> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
> >> +		return ns_capable(ns, CAP_PERFMON);
> >> +
> >> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
> >> +		return ns_capable(ns, CAP_SYS_ADMIN);
> >> +
> >> +	return false;
> >> +}
> >
> > Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
>
> Some of the ideas are from the v4 review.

Well, in the requested changes from v4 I wrote:
return capable(CAP_PERFMON);
instead of
return false;

That's what Andy suggested earlier for CAP_BPF.
I think that should resolve Stephen's concern.
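The suggestion above (end with an audited `capable(CAP_PERFMON)` instead of `return false`) can be sketched in userspace C as follows. This is an illustrative model rather than kernel code: it assumes that an audited check records only denials, and every `model_` name and capability number is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability numbers, used only inside this model. */
enum model_cap { MODEL_CAP_SYS_ADMIN, MODEL_CAP_PERFMON, MODEL_CAP_COUNT };

struct model_task {
	unsigned int effective;  /* bitmask of capabilities the task holds */
	unsigned int denied_log; /* bitmask of capabilities with an audited denial */
};

/* Stand-in for ns_capable_noaudit(): a denial leaves no audit record. */
static bool model_capable_noaudit(const struct model_task *t, int cap)
{
	assert(cap >= 0 && cap < MODEL_CAP_COUNT);
	return (t->effective >> cap) & 1u;
}

/* Stand-in for capable(): a denial is recorded against that capability. */
static bool model_capable(struct model_task *t, int cap)
{
	bool ok = model_capable_noaudit(t, cap);

	if (!ok)
		t->denied_log |= 1u << cap;
	return ok;
}

/* v5 control flow with the final "return false" replaced as suggested. */
static bool perfmon_capable_fallback(struct model_task *t)
{
	if (model_capable_noaudit(t, MODEL_CAP_PERFMON))
		return model_capable(t, MODEL_CAP_PERFMON);

	if (model_capable_noaudit(t, MODEL_CAP_SYS_ADMIN))
		return model_capable(t, MODEL_CAP_SYS_ADMIN);

	/* Neither capability is held: this audited check fails and is logged. */
	return model_capable(t, MODEL_CAP_PERFMON);
}
```

Under these assumptions an unprivileged task is still denied, but the denial is now recorded, and only against the narrower capability.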
On 21.01.2020 20:55, Alexei Starovoitov wrote:
> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
> <alexey.budankov@linux.intel.com> wrote:
>>
>>
>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>> <SNIP>
>>>> +static inline bool perfmon_capable(void)
>>>> +{
>>>> +	struct user_namespace *ns = &init_user_ns;
>>>> +
>>>> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
>>>> +		return ns_capable(ns, CAP_PERFMON);
>>>> +
>>>> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
>>>> +		return ns_capable(ns, CAP_SYS_ADMIN);
>>>> +
>>>> +	return false;
>>>> +}
>>>
>>> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
>>
>> Some of the ideas are from the v4 review.
>
> Well, in the requested changes from v4 I wrote:
> return capable(CAP_PERFMON);
> instead of
> return false;

Aww, indeed. I was concerned about exactly that when updating the patch
and simply put false, missing the fact that capable() also logs.

I suppose the idea is originally from here [1].
BTW, has it already seen any _more optimal_ implementation?
Anyway, the original or an optimized version could be reused for CAP_PERFMON.

~Alexey

[1] https://patchwork.ozlabs.org/patch/1159243/

>
> That's what Andy suggested earlier for CAP_BPF.
> I think that should resolve Stephen's concern.
>
On 21.01.2020 21:27, Alexey Budankov wrote:
>
> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>> <alexey.budankov@linux.intel.com> wrote:
>>>
>>>
>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>> <SNIP>
>>>>> +static inline bool perfmon_capable(void)
>>>>> +{
>>>>> +	struct user_namespace *ns = &init_user_ns;
>>>>> +
>>>>> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
>>>>> +		return ns_capable(ns, CAP_PERFMON);
>>>>> +
>>>>> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
>>>>> +		return ns_capable(ns, CAP_SYS_ADMIN);
>>>>> +
>>>>> +	return false;
>>>>> +}
>>>>
>>>> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.

So far so good, I suggest using the simplest version for v6:

static inline bool perfmon_capable(void)
{
	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
}

It keeps the implementation simple and readable. The implementation is more
performant in the sense of calling the API - one capable() call for CAP_PERFMON
privileged process.

Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
but this bloating also advertises and leverages using more secure CAP_PERFMON
based approach to use perf_event_open system call.

~Alexey

>>>
>>> Some of the ideas are from the v4 review.
>>
>> Well, in the requested changes from v4 I wrote:
>> return capable(CAP_PERFMON);
>> instead of
>> return false;
>
> Aww, indeed. I was concerned about exactly that when updating the patch
> and simply put false, missing the fact that capable() also logs.
>
> I suppose the idea is originally from here [1].
> BTW, has it already seen any _more optimal_ implementation?
> Anyway, the original or an optimized version could be reused for CAP_PERFMON.
>
> ~Alexey
>
> [1] https://patchwork.ozlabs.org/patch/1159243/
>
>>
>> That's what Andy suggested earlier for CAP_BPF.
>> I think that should resolve Stephen's concern.
>>
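The trade-off described for the v6 version can be illustrated with a small userspace model in C. It is a sketch, not kernel code: it assumes every `capable()` call is one audited check and that only a denial leaves a record, and all `model_` names and capability numbers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability numbers, used only inside this model. */
enum model_cap { MODEL_CAP_SYS_ADMIN, MODEL_CAP_PERFMON, MODEL_CAP_COUNT };

struct model_task {
	unsigned int effective;       /* bitmask of capabilities the task holds */
	unsigned int audited_checks;  /* number of capable()-style calls made */
	unsigned int audited_denials; /* how many of those calls were denied */
};

/* Stand-in for capable(): counts the call and records a denial. */
static bool model_capable(struct model_task *t, int cap)
{
	bool ok;

	assert(cap >= 0 && cap < MODEL_CAP_COUNT);
	ok = (t->effective >> cap) & 1u;
	t->audited_checks++;
	if (!ok)
		t->audited_denials++;
	return ok;
}

/* Same shape as the v6 proposal: short-circuit on the narrow capability. */
static bool perfmon_capable_v6(struct model_task *t)
{
	return model_capable(t, MODEL_CAP_PERFMON) ||
	       model_capable(t, MODEL_CAP_SYS_ADMIN);
}
```

In this model a task holding the perfmon capability costs one check and no denial record. A task holding only the admin capability is granted access after one recorded denial, and a task holding neither is refused after two.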
On 1/22/20 5:45 AM, Alexey Budankov wrote:
>
> On 21.01.2020 21:27, Alexey Budankov wrote:
>>
>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>> <alexey.budankov@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>> <SNIP>
>>>>>
>>>>> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
>
> So far so good, I suggest using the simplest version for v6:
>
> static inline bool perfmon_capable(void)
> {
> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
> }
>
> It keeps the implementation simple and readable. The implementation is more
> performant in the sense of calling the API - one capable() call for CAP_PERFMON
> privileged process.
>
> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
> but this bloating also advertises and leverages using more secure CAP_PERFMON
> based approach to use perf_event_open system call.

I can live with that.  We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue.  We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
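The triage rule described above (when both the narrow and the broad capability show up as denied, try granting only the narrow one first) can be written down as a small helper. This is a hypothetical userspace sketch in C: the enum values, the `model_pairs` table and `caps_to_try_first()` are made up for illustration, and treating CAP_DAC_READ_SEARCH as the narrow half of its pair is our reading of the analogy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability numbers, used only inside this model. */
enum model_cap {
	MODEL_CAP_DAC_OVERRIDE,
	MODEL_CAP_DAC_READ_SEARCH,
	MODEL_CAP_SYS_ADMIN,
	MODEL_CAP_PERFMON,
	MODEL_CAP_COUNT
};

/* A narrow capability and the broader one that is checked as a fallback. */
struct model_cap_pair {
	int narrow;
	int broad;
};

static const struct model_cap_pair model_pairs[] = {
	{ MODEL_CAP_PERFMON, MODEL_CAP_SYS_ADMIN },
	{ MODEL_CAP_DAC_READ_SEARCH, MODEL_CAP_DAC_OVERRIDE },
};

/*
 * Given a bitmask of capabilities seen in denial messages for one process,
 * return the bitmask to try granting first: when both halves of a pair were
 * denied, keep the narrow one and drop the broad one.
 */
static unsigned int caps_to_try_first(unsigned int denied)
{
	unsigned int suggest = denied;
	size_t i;

	assert((denied >> MODEL_CAP_COUNT) == 0u);

	for (i = 0; i < sizeof(model_pairs) / sizeof(model_pairs[0]); i++) {
		unsigned int narrow = 1u << model_pairs[i].narrow;
		unsigned int broad = 1u << model_pairs[i].broad;

		if ((denied & narrow) && (denied & broad))
			suggest &= ~broad;
	}
	return suggest;
}
```

A denial of the broad capability on its own is left untouched, since there is then no evidence that the narrow one would be enough.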
On 22.01.2020 17:07, Stephen Smalley wrote:
> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>
>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>>
>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>>> <alexey.budankov@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>> <SNIP>
>>>>>>
>>>>>> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
>>
>> So far so good, I suggest using the simplest version for v6:
>>
>> static inline bool perfmon_capable(void)
>> {
>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>> }
>>
>> It keeps the implementation simple and readable. The implementation is more
>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>> privileged process.
>>
>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>> based approach to use perf_event_open system call.
>
> I can live with that.  We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue.  We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.

The perf security document [1] can be updated, at least, to align with and
document these audit logging specifics.

~Alexey

[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
On 22.01.2020 17:25, Alexey Budankov wrote:
>
> On 22.01.2020 17:07, Stephen Smalley wrote:
>> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>>
>>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>>>
>>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>>>> <alexey.budankov@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>>> <SNIP>
>>>>>>>>
>>>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>>>>>
>>>>>>> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
>>>
>>> So far so good, I suggest using the simplest version for v6:
>>>
>>> static inline bool perfmon_capable(void)
>>> {
>>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>>> }
>>>
>>> It keeps the implementation simple and readable. The implementation is more
>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>> privileged process.
>>>
>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>> based approach to use perf_event_open system call.
>>
>> I can live with that.  We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue.  We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
>
> The perf security document [1] can be updated, at least, to align with and
> document these audit logging specifics.

And I plan to update the document right after this patch set is accepted.
Feel free to let me know of the places in the kernel docs that also
require an update w.r.t. the CAP_PERFMON extension.

~Alexey

>
> ~Alexey
>
> [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
>
Alexey Budankov <alexey.budankov@linux.intel.com> writes:
> On 22.01.2020 17:25, Alexey Budankov wrote:
>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>>> It keeps the implementation simple and readable. The implementation is more
>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>>> privileged process.
>>>>
>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>>> based approach to use perf_event_open system call.
>>>
>>> I can live with that. We just need to document that when you see
>>> both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process,
>>> try only allowing CAP_PERFMON first and see if that resolves the
>>> issue. We have a similar issue with CAP_DAC_READ_SEARCH versus
>>> CAP_DAC_OVERRIDE.
>>
>> perf security [1] document can be updated, at least, to align and document
>> this audit logging specifics.
>
> And I plan to update the document right after this patch set is accepted.
> Feel free to let me know of the places in the kernel docs that also
> require update w.r.t CAP_PERFMON extension.

The documentation update wants to be part of the patch set, not something
planned to be done _after_ the patch set is merged.

Thanks,

        tglx
On 07.02.2020 14:38, Thomas Gleixner wrote:
> Alexey Budankov <alexey.budankov@linux.intel.com> writes:
>> On 22.01.2020 17:25, Alexey Budankov wrote:
>>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>>>> It keeps the implementation simple and readable. The implementation is more
>>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>>>> privileged process.
>>>>>
>>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>>>> based approach to use perf_event_open system call.
>>>>
>>>> I can live with that. We just need to document that when you see
>>>> both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process,
>>>> try only allowing CAP_PERFMON first and see if that resolves the
>>>> issue. We have a similar issue with CAP_DAC_READ_SEARCH versus
>>>> CAP_DAC_OVERRIDE.
>>>
>>> perf security [1] document can be updated, at least, to align and document
>>> this audit logging specifics.
>>
>> And I plan to update the document right after this patch set is accepted.
>> Feel free to let me know of the places in the kernel docs that also
>> require update w.r.t CAP_PERFMON extension.
>
> The documentation update wants be part of the patch set and not planned
> to be done _after_ the patch set is merged.

Well, accepted. The documentation updates are going to be patches #11 and beyond in the series.

Thanks,
Alexey

>
> Thanks,
>
> tglx
>
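The audit-log trade-off discussed above follows from the short-circuit `||` in the proposed perfmon_capable(). The sketch below is a userspace model only, not kernel code: `struct mock_task`, `mock_capable()` and `mock_perfmon_capable()` are hypothetical names, and the model assumes that every failed check produces exactly one denial record, which a real kernel does only when the security module and policy in use actually audit that check. The capability numbers are the ones used in include/uapi/linux/capability.h.

```c
#include <stdbool.h>
#include <stdio.h>

/* Capability numbers as in include/uapi/linux/capability.h. */
#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38

/* Hypothetical stand-in for a task (illustration only). */
struct mock_task {
	unsigned long long granted; /* bitmask of capabilities the task passes */
	int denials_logged;         /* denial records produced so far */
};

/* Mock of capable(): ASSUMES each failed check logs one denial record. */
static bool mock_capable(struct mock_task *t, int cap)
{
	if ((t->granted >> cap) & 1ULL)
		return true;
	t->denials_logged++;
	printf("audit: denied capability %d\n", cap);
	return false;
}

/* Same shape as the v6 proposal: CAP_SYS_ADMIN is consulted only
 * when the CAP_PERFMON check has already failed. */
static bool mock_perfmon_capable(struct mock_task *t)
{
	return mock_capable(t, CAP_PERFMON) || mock_capable(t, CAP_SYS_ADMIN);
}
```

Under these assumptions a task holding CAP_PERFMON passes with one call and no record, a task holding only CAP_SYS_ADMIN passes but leaves one CAP_PERFMON denial behind, and a task holding neither leaves two denials. That pair of records is the pattern Stephen suggests documenting.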
Hi Stephen,

On 22.01.2020 17:07, Stephen Smalley wrote:
> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>
>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>>
>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>>> <alexey.budankov@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>>
<SNIP>
>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance
>>>>>>
>>>>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.
>>
>> So far so good, I suggest using the simplest version for v6:
>>
>> static inline bool perfmon_capable(void)
>> {
>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>> }
>>
>> It keeps the implementation simple and readable. The implementation is more
>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>> privileged process.
>>
>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>> based approach to use perf_event_open system call.
>
> I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.

I am trying to reproduce this double logging with CAP_PERFMON.
I am using the refpolicy version with the perf_event tclass enabled [1], in permissive mode.
When running perf stat -a, I observe these AVC audit messages:

type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1

However, there are no capability-related messages among them. I suppose my refpolicy should
be modified somehow so that capability-related AVCs can be observed.

Could you please comment on or clarify how to enable caps-related AVCs in order
to test the logging in question?

Thanks,
Alexey

---
[1] https://github.com/SELinuxProject/refpolicy.git
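When scanning audit.log for the records in question, the `tclass=` field is what separates perf_event checks from capability checks. A minimal sketch of that filter follows. `avc_tclass()` and `avc_is_capability_record()` are hypothetical helper names, and the filter assumes only the `capability` and `capability2` class names are of interest.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy the value of the "tclass=" field of one AVC record into out.
 * Returns false if the field is missing or does not fit in out. */
static bool avc_tclass(const char *record, char *out, size_t outlen)
{
	const char *p = strstr(record, "tclass=");
	size_t len;

	if (!p)
		return false;
	p += strlen("tclass=");
	len = strcspn(p, " \n");        /* field ends at space or newline */
	if (len == 0 || len >= outlen)
		return false;
	memcpy(out, p, len);
	out[len] = '\0';
	return true;
}

/* True if the record belongs to a capability class (assumed set of names). */
static bool avc_is_capability_record(const char *record)
{
	char cls[32];

	if (!avc_tclass(record, cls, sizeof(cls)))
		return false;
	return strcmp(cls, "capability") == 0 ||
	       strcmp(cls, "capability2") == 0;
}
```

All four records quoted above carry `tclass=perf_event`, so the filter rejects each of them, which matches the observation that no capability-related messages were logged.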
On 2/12/20 3:53 AM, Alexey Budankov wrote:
<SNIP>
> I am trying to reproduce this double logging with CAP_PERFMON.
> I am using the refpolicy version with enabled perf_event tclass [1], in permissive mode.
> When running perf stat -a I am observing this AVC audit messages:
>
> type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
> type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
> type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
> type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
>
> However there is no capability related messages around. I suppose my refpolicy should
> be modified somehow to observe capability related AVCs.
>
> Could you please comment or clarify on how to enable caps related AVCs in order
> to test the concerned logging.

The new perfmon permission has to be defined in your policy; you'll have a message in dmesg about "Permission perfmon in class capability2 not defined in policy.". You can either add it to the common cap2 definition in refpolicy/policy/flask/access_vectors and rebuild your policy, or extract your base module as CIL, add it there, and insert the updated module.
On 12.02.2020 16:32, Stephen Smalley wrote:
<SNIP>
> The new perfmon permission has to be defined in your policy; you'll have a message in dmesg about "Permission perfmon in class capability2 not defined in policy.". You can either add it to the common cap2 definition in refpolicy/policy/flask/access_vectors and rebuild your policy or extract your base module as CIL, add it there, and insert the updated module.

Yes, I already have it like this:

common cap2
{
	mac_override	# unused by SELinux
	mac_admin
	syslog
	wake_alarm
	block_suspend
	audit_read
	perfmon
}

dmesg stopped reporting perfmon as not defined, but audit.log still doesn't report CAP_PERFMON denials.
BTW, audit doesn't even report CAP_SYS_ADMIN denials, although perfmon_capable() does check for it.

~Alexey
On 2/12/20 8:53 AM, Alexey Budankov wrote:
<SNIP>
> dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials.
> BTW, audit even doesn't report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it.

Some denials may be silenced by dontaudit rules; semodule -DB will strip those and semodule -B will restore them. The other possibility is that the process doesn't have CAP_PERFMON in its effective set and therefore never reaches SELinux at all; it is denied first by the capability module.
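Whether a process has CAP_PERFMON in its effective set can be checked from userspace by reading the `CapEff:` line of /proc/self/status and testing the relevant bit. The following is a minimal sketch of that check. The helper names are hypothetical, and the bit numbers 21 (CAP_SYS_ADMIN) and 38 (CAP_PERFMON) are taken from include/uapi/linux/capability.h.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38

/* Extract the CapEff bitmask from text laid out like /proc/<pid>/status.
 * Returns true and stores the mask on success. */
static bool parse_cap_eff(const char *status, unsigned long long *mask)
{
	const char *line = status;

	while (line && *line) {
		if (strncmp(line, "CapEff:", 7) == 0)
			return sscanf(line + 7, "%llx", mask) == 1;
		line = strchr(line, '\n');  /* advance to the next line */
		if (line)
			line++;
	}
	return false;
}

static bool mask_has_cap(unsigned long long mask, int cap)
{
	return (mask >> cap) & 1ULL;
}

/* Returns 1 if cap is in this process's effective set, 0 if it is not,
 * and -1 if /proc/self/status could not be read or parsed. */
static int self_has_effective_cap(int cap)
{
	char buf[8192];
	unsigned long long mask;
	size_t n;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	if (!parse_cap_eff(buf, &mask))
		return -1;
	return mask_has_cap(mask, cap) ? 1 : 0;
}
```

Calling `self_has_effective_cap(CAP_PERFMON)` from the process that is about to open perf events tells which of the two explanations applies: a result of 0 means the request is refused before SELinux is consulted, so no AVC record should be expected.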
On 2/12/20 10:21 AM, Stephen Smalley wrote:
<SNIP>
> Some denials may be silenced by dontaudit rules; semodule -DB will strip
> those and semodule -B will restore them. Other possibility is that the
> process doesn't have CAP_PERFMON in its effective set and therefore
> never reaches SELinux at all; denied first by the capability module.

Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that).
On 12.02.2020 18:21, Stephen Smalley wrote:
<SNIP>
> Some denials may be silenced by dontaudit rules; semodule -DB will strip those and semodule -B will restore them. Other possibility is that the process doesn't have CAP_PERFMON in its effective set and therefore never reaches SELinux at all; denied first by the capability module.

Yes, that all makes sense.
selinux_capable() logs through avc_audit() but cap_capable() does not log anything, so the order in which they are called matters.
I am doing debug tracing of the kernel code to find the exact reasons.

~Alexey
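The ordering effect described here can be modelled in a few lines. This is a userspace sketch under stated assumptions, not the kernel's hook chain: `mock_cap_check()` stands in for the silent effective-set check, `mock_policy_check()` stands in for a policy check that logs its denials, and all names and the `struct mock_task` layout are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38

/* Hypothetical task model (illustration only). */
struct mock_task {
	unsigned long long effective;      /* effective capability set */
	unsigned long long policy_allowed; /* capabilities the MAC policy allows */
	int avc_denials;                   /* records written by the policy stage */
};

/* Stage 1: effective-set check, modelled as silent on failure. */
static bool mock_cap_check(const struct mock_task *t, int cap)
{
	return (t->effective >> cap) & 1ULL;
}

/* Stage 2: policy check, modelled as logging every denial. */
static bool mock_policy_check(struct mock_task *t, int cap)
{
	if ((t->policy_allowed >> cap) & 1ULL)
		return true;
	t->avc_denials++;
	printf("avc: denied capability %d\n", cap);
	return false;
}

/* The policy stage runs only if the effective-set stage has passed. */
static bool mock_capable(struct mock_task *t, int cap)
{
	if (!mock_cap_check(t, cap))
		return false;
	return mock_policy_check(t, cap);
}
```

In this model a task without CAP_PERFMON in its effective set is refused with `avc_denials` still at zero, while a task that has the bit set but is not allowed by policy is refused with one record. An empty audit.log is therefore consistent with the request never reaching the policy stage.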
On 12.02.2020 18:45, Stephen Smalley wrote:
<SNIP>
> Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that).

It is user_t for local terminal session:
ps -Z
LABEL                             PID TTY          TIME CMD
user_u:user_r:user_t            11317 pts/9    00:00:00 bash
user_u:user_r:user_t            11796 pts/9    00:00:00 ps

For local terminal root session:
ps -Z
LABEL                             PID TTY          TIME CMD
user_u:user_r:user_su_t          2926 pts/3    00:00:00 bash
user_u:user_r:user_su_t         10995 pts/3    00:00:00 ps

For remote ssh session:
ps -Z
LABEL                             PID TTY          TIME CMD
user_u:user_r:user_t             7540 pts/8    00:00:00 ps
user_u:user_r:user_systemd_t     8875 pts/8    00:00:00 bash

~Alexey
On 2/12/20 11:56 AM, Alexey Budankov wrote: > > > On 12.02.2020 18:45, Stephen Smalley wrote: >> On 2/12/20 10:21 AM, Stephen Smalley wrote: >>> On 2/12/20 8:53 AM, Alexey Budankov wrote: >>>> On 12.02.2020 16:32, Stephen Smalley wrote: >>>>> On 2/12/20 3:53 AM, Alexey Budankov wrote: >>>>>> Hi Stephen, >>>>>> >>>>>> On 22.01.2020 17:07, Stephen Smalley wrote: >>>>>>> On 1/22/20 5:45 AM, Alexey Budankov wrote: >>>>>>>> >>>>>>>> On 21.01.2020 21:27, Alexey Budankov wrote: >>>>>>>>> >>>>>>>>> On 21.01.2020 20:55, Alexei Starovoitov wrote: >>>>>>>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov >>>>>>>>>> <alexey.budankov@linux.intel.com> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 21.01.2020 17:43, Stephen Smalley wrote: >>>>>>>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote: >>>>>>>>>>>>> >>>>>> <SNIP> >>>>>>>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance >>>>>>>>>>>> >>>>>>>>>>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message. >>>>>>>> >>>>>>>> So far so good, I suggest using the simplest version for v6: >>>>>>>> >>>>>>>> static inline bool perfmon_capable(void) >>>>>>>> { >>>>>>>> return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN); >>>>>>>> } >>>>>>>> >>>>>>>> It keeps the implementation simple and readable. The implementation is more >>>>>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON >>>>>>>> privileged process. >>>>>>>> >>>>>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes, >>>>>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON >>>>>>>> based approach to use perf_event_open system call. >>>>>>> >>>>>>> I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. 
We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE. >>>>>> >>>>>> I am trying to reproduce this double logging with CAP_PERFMON. >>>>>> I am using the refpolicy version with enabled perf_event tclass [1], in permissive mode. >>>>>> When running perf stat -a I am observing this AVC audit messages: >>>>>> >>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>> type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>> >>>>>> However there is no capability related messages around. I suppose my refpolicy should >>>>>> be modified somehow to observe capability related AVCs. >>>>>> >>>>>> Could you please comment or clarify on how to enable caps related AVCs in order >>>>>> to test the concerned logging. >>>>> >>>>> The new perfmon permission has to be defined in your policy; you'll have a message in dmesg about "Permission perfmon in class capability2 not defined in policy.". You can either add it to the common cap2 definition in refpolicy/policy/flask/access_vectors and rebuild your policy or extract your base module as CIL, add it there, and insert the updated module. 
>>>> >>>> Yes, I already have it like this: >>>> common cap2 >>>> { >>>> <------>mac_override<--># unused by SELinux >>>> <------>mac_admin >>>> <------>syslog >>>> <------>wake_alarm >>>> <------>block_suspend >>>> <------>audit_read >>>> <------>perfmon >>>> } >>>> >>>> dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials. >>>> BTW, audit even doesn't report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it. >>> >>> Some denials may be silenced by dontaudit rules; semodule -DB will strip those and semodule -B will restore them. Other possibility is that the process doesn't have CAP_PERFMON in its effective set and therefore never reaches SELinux at all; denied first by the capability module. >> >> Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that). > > It is user_t for local terminal session: > ps -Z > LABEL PID TTY TIME CMD > user_u:user_r:user_t 11317 pts/9 00:00:00 bash > user_u:user_r:user_t 11796 pts/9 00:00:00 ps > > For local terminal root session: > ps -Z > LABEL PID TTY TIME CMD > user_u:user_r:user_su_t 2926 pts/3 00:00:00 bash > user_u:user_r:user_su_t 10995 pts/3 00:00:00 ps > > For remote ssh session: > ps -Z > LABEL PID TTY TIME CMD > user_u:user_r:user_t 7540 pts/8 00:00:00 ps > user_u:user_r:user_systemd_t 8875 pts/8 00:00:00 bash That's a bug in either your policy or your userspace/distro integration. In any event, unless user_systemd_t is allowed all capability2 permissions by your policy, you should see the denials if CAP_PERFMON is set in the effective capability set of the process.
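The second possibility raised in this exchange (CAP_PERFMON missing from the effective set, so the check never reaches SELinux) can be verified from userspace by decoding the CapEff field of /proc/<pid>/status. The following is only a minimal sketch: the helper names parse_cap_eff(), cap_in_mask() and read_cap_eff() are made up for illustration and are not part of the patch, and the value 38 for CAP_PERFMON is taken from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Capability numbers; 38 for CAP_PERFMON is the value proposed in this patch. */
#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38

/* Parse a "CapEff:\t<hex mask>" line; returns false for any other line. */
bool parse_cap_eff(const char *line, uint64_t *mask)
{
	unsigned long long v;

	if (strncmp(line, "CapEff:", 7) != 0)
		return false;
	if (sscanf(line + 7, "%llx", &v) != 1)
		return false;
	*mask = (uint64_t)v;
	return true;
}

/* Test one capability bit in a 64-bit capability mask. */
bool cap_in_mask(uint64_t mask, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (mask >> cap) & 1;
}

/*
 * Read the effective capability set from a status file such as
 * "/proc/self/status" or "/proc/<pid>/status".
 * Returns false on I/O error or if no CapEff line is found.
 */
bool read_cap_eff(const char *status_path, uint64_t *mask)
{
	char line[256];
	bool found = false;
	FILE *f = fopen(status_path, "r");

	if (!f)
		return false;
	while (!found && fgets(line, sizeof(line), f))
		found = parse_cap_eff(line, mask);
	fclose(f);
	return found;
}
```

To inspect a running perf process, one would call read_cap_eff() with the path /proc/<pid>/status of that process and then test cap_in_mask(mask, CAP_PERFMON). If the bit is clear, the capability module denies the request before SELinux is consulted, so no AVC record is expected.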
On 12.02.2020 20:09, Stephen Smalley wrote: > On 2/12/20 11:56 AM, Alexey Budankov wrote: >> >> >> On 12.02.2020 18:45, Stephen Smalley wrote: >>> On 2/12/20 10:21 AM, Stephen Smalley wrote: >>>> On 2/12/20 8:53 AM, Alexey Budankov wrote: >>>>> On 12.02.2020 16:32, Stephen Smalley wrote: >>>>>> On 2/12/20 3:53 AM, Alexey Budankov wrote: >>>>>>> Hi Stephen, >>>>>>> >>>>>>> On 22.01.2020 17:07, Stephen Smalley wrote: >>>>>>>> On 1/22/20 5:45 AM, Alexey Budankov wrote: >>>>>>>>> >>>>>>>>> On 21.01.2020 21:27, Alexey Budankov wrote: >>>>>>>>>> >>>>>>>>>> On 21.01.2020 20:55, Alexei Starovoitov wrote: >>>>>>>>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov >>>>>>>>>>> <alexey.budankov@linux.intel.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 21.01.2020 17:43, Stephen Smalley wrote: >>>>>>>>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote: >>>>>>>>>>>>>> >>>>>>> <SNIP> >>>>>>>>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance >>>>>>>>>>>>> >>>>>>>>>>>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message. >>>>>>>>> >>>>>>>>> So far so good, I suggest using the simplest version for v6: >>>>>>>>> >>>>>>>>> static inline bool perfmon_capable(void) >>>>>>>>> { >>>>>>>>> return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN); >>>>>>>>> } >>>>>>>>> >>>>>>>>> It keeps the implementation simple and readable. The implementation is more >>>>>>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON >>>>>>>>> privileged process. >>>>>>>>> >>>>>>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes, >>>>>>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON >>>>>>>>> based approach to use perf_event_open system call. >>>>>>>> >>>>>>>> I can live with that. 
We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE. >>>>>>> >>>>>>> I am trying to reproduce this double logging with CAP_PERFMON. >>>>>>> I am using the refpolicy version with enabled perf_event tclass [1], in permissive mode. >>>>>>> When running perf stat -a I am observing this AVC audit messages: >>>>>>> >>>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>>> type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1 >>>>>>> >>>>>>> However there is no capability related messages around. I suppose my refpolicy should >>>>>>> be modified somehow to observe capability related AVCs. >>>>>>> >>>>>>> Could you please comment or clarify on how to enable caps related AVCs in order >>>>>>> to test the concerned logging. >>>>>> >>>>>> The new perfmon permission has to be defined in your policy; you'll have a message in dmesg about "Permission perfmon in class capability2 not defined in policy.". 
You can either add it to the common cap2 definition in refpolicy/policy/flask/access_vectors and rebuild your policy or extract your base module as CIL, add it there, and insert the updated module. >>>>> >>>>> Yes, I already have it like this: >>>>> common cap2 >>>>> { >>>>> <------>mac_override<--># unused by SELinux >>>>> <------>mac_admin >>>>> <------>syslog >>>>> <------>wake_alarm >>>>> <------>block_suspend >>>>> <------>audit_read >>>>> <------>perfmon >>>>> } >>>>> >>>>> dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials. >>>>> BTW, audit even doesn't report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it. >>>> >>>> Some denials may be silenced by dontaudit rules; semodule -DB will strip those and semodule -B will restore them. Other possibility is that the process doesn't have CAP_PERFMON in its effective set and therefore never reaches SELinux at all; denied first by the capability module. >>> >>> Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that). >> >> It is user_t for local terminal session: >> ps -Z >> LABEL PID TTY TIME CMD >> user_u:user_r:user_t 11317 pts/9 00:00:00 bash >> user_u:user_r:user_t 11796 pts/9 00:00:00 ps >> >> For local terminal root session: >> ps -Z >> LABEL PID TTY TIME CMD >> user_u:user_r:user_su_t 2926 pts/3 00:00:00 bash >> user_u:user_r:user_su_t 10995 pts/3 00:00:00 ps >> >> For remote ssh session: >> ps -Z >> LABEL PID TTY TIME CMD >> user_u:user_r:user_t 7540 pts/8 00:00:00 ps >> user_u:user_r:user_systemd_t 8875 pts/8 00:00:00 bash > > That's a bug in either your policy or your userspace/distro integration. 
In any event, unless user_systemd_t is allowed all capability2 permissions by your policy, you should see the denials if CAP_PERFMON is set in the effective capability set of the process.
>

That all seems to be true. After instrumentation, rebuilding and rebooting, in CAP_PERFMON case:

$ getcap perf
perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep

$ perf stat -a

type=AVC msg=audit(1581580399.165:784): avc: denied { open } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:785): avc: denied { perfmon } for pid=8859 comm="perf" capability=38 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit(1581580399.165:786): avc: denied { kernel } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:787): avc: denied { cpu } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:788): avc: denied { write } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580408.078:791): avc: denied { read } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1

dmesg:

[ 137.877713] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877774] cread_has_capability(CAP_PERFMON) = 0
[ 137.877775] prior avc_audit(CAP_PERFMON)
[ 137.877779] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
[ 137.877784] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877785] cread_has_capability(CAP_PERFMON) = 0
[ 137.877786] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
[ 137.877794] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877795] cread_has_capability(CAP_PERFMON) = 0
[ 137.877796] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
...

in CAP_SYS_ADMIN case:

$ getcap perf
perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep

$ perf stat -a

type=AVC msg=audit(1581580747.928:835): avc: denied { open } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:836): avc: denied { cpu } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:837): avc: denied { kernel } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:838): avc: denied { read } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:839): avc: denied { write } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
...

$ perf record -- ls
...
type=AVC msg=audit(1581580747.930:843): avc: denied { sys_ptrace } for pid=8927 comm="perf" capability=19 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability permissive=1
...

dmesg:

[ 276.714266] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714268] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714269] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714270] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714270] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
[ 276.714287] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714287] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714288] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714288] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714289] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
[ 276.714294] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714295] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714295] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714296] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714296] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
...

in unprivileged case:

$ getcap perf
perf =

$ perf stat -a; perf record -a
...

dmesg:

[ 947.275611] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 947.275613] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 947.275614] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 947.275615] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = -1
[ 947.275636] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 947.275637] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 947.275638] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 947.275638] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = -1
...

So it looks like CAP_PERFMON and CAP_SYS_ADMIN are not ever logged by AVC simultaneously, in the current LSM and perfmon_capable() implementations.

If perfmon is granted:
	perfmon is not logged by capabilities, perfmon is logged by AVC,
	no check for sys_admin by perfmon_capable().

If perfmon is not granted but sys_admin is granted:
	perfmon is not logged by capabilities, AVC logging is not called for perfmon,
	sys_admin is not logged by capabilities, sys_admin is not logged by AVC, for some intended reason?

No caps are granted:
	AVC logging is not called either for perfmon or for sys_admin.

BTW, is there a way to maybe drop the AVC cache so that denials would appear in audit on the next AV access?

Well, I guess you initially mentioned a case similar to this (note that the audit ids are not the same but the pid= values are):

type=AVC msg=audit(1581580399.165:784): avc: denied { open } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:785): avc: denied { perfmon } for pid=8859 comm="perf" capability=38 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit( . : ): avc: denied { sys_admin } for pid=8859 comm="perf" capability=21 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability permissive=1
type=AVC msg=audit(1581580399.165:786): avc: denied { kernel } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:787): avc: denied { cpu } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:788): avc: denied { write } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580408.078:791): avc: denied { read } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1

So the message could be like this:

"If audit logs for a process using perf_events related syscalls, i.e.
perf_event_open(), read(), write(), ioctl(), mmap(), contain denials for
both CAP_PERFMON and CAP_SYS_ADMIN capabilities, then providing the process
with CAP_PERFMON capability singly is the secure preferred approach to
resolve access denials to performance monitoring and observability operations."

~Alexey
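The three cases above, and the double-logging case discussed earlier in the thread, can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct sim_task, sim_capable() and sim_perfmon_capable() are hypothetical names, the capability module is modeled as a check of the effective set that runs before SELinux, and the SELinux policy is reduced to a bitmask of allowed capabilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers; 38 for CAP_PERFMON is the value proposed in this patch. */
enum { SIM_CAP_SYS_ADMIN = 21, SIM_CAP_PERFMON = 38 };

#define SIM_BIT(cap) (1ULL << (cap))

/* Hypothetical model of a process as seen by the two checks. */
struct sim_task {
	uint64_t effective;    /* effective capability set of the process */
	uint64_t policy_allow; /* capabilities the (assumed) SELinux policy allows */
	bool permissive;       /* permissive mode: log the denial but allow */
	uint64_t avc_logged;   /* capabilities for which an AVC denial was logged */
};

/* Model of capable(): capability module first, then SELinux. */
bool sim_capable(struct sim_task *t, int cap)
{
	assert(cap >= 0 && cap < 64);

	/* Not in the effective set: denied here, SELinux is never consulted. */
	if (!(t->effective & SIM_BIT(cap)))
		return false;

	/* In the effective set but not allowed by policy: AVC denial is logged. */
	if (!(t->policy_allow & SIM_BIT(cap))) {
		t->avc_logged |= SIM_BIT(cap);
		return t->permissive;
	}
	return true;
}

/* Model of the simple perfmon_capable() variant suggested for v6. */
bool sim_perfmon_capable(struct sim_task *t)
{
	return sim_capable(t, SIM_CAP_PERFMON) ||
	       sim_capable(t, SIM_CAP_SYS_ADMIN);
}
```

In this model a permissive-mode task with only CAP_PERFMON logs one perfmon denial and never checks sys_admin, while a task with no capabilities logs nothing. Both denials are logged together only when both capabilities are in the effective set, the policy allows neither, and SELinux is enforcing, because the first check then fails and the second one runs.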
On 07.02.2020 16:39, Alexey Budankov wrote:
>
> On 07.02.2020 14:38, Thomas Gleixner wrote:
>> Alexey Budankov <alexey.budankov@linux.intel.com> writes:
>>> On 22.01.2020 17:25, Alexey Budankov wrote:
>>>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>>>>> It keeps the implementation simple and readable. The implementation is more
>>>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>>>>> privileged process.
>>>>>>
>>>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>>>>> based approach to use perf_event_open system call.
>>>>>
>>>>> I can live with that. We just need to document that when you see
>>>>> both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process,
>>>>> try only allowing CAP_PERFMON first and see if that resolves the
>>>>> issue. We have a similar issue with CAP_DAC_READ_SEARCH versus
>>>>> CAP_DAC_OVERRIDE.
>>>>
>>>> perf security [1] document can be updated, at least, to align and document
>>>> this audit logging specifics.
>>>
>>> And I plan to update the document right after this patch set is accepted.
>>> Feel free to let me know of the places in the kernel docs that also
>>> require update w.r.t CAP_PERFMON extension.
>>
>> The documentation update wants be part of the patch set and not planned
>> to be done _after_ the patch set is merged.
>
> Well, accepted. It is going to make patches #11 and beyond.

Patches #11 and #12 of v7 [1] contain information on CAP_PERFMON intention and usage.
Patch for man-pages [2] extends perf_event_open.2 documentation.

Thanks,
Alexey

---
[1] https://lore.kernel.org/lkml/c8de937a-0b3a-7147-f5ef-69f467e87a13@linux.intel.com/
[2] https://lore.kernel.org/lkml/18d1083d-efe5-f5f8-c531-d142c0e5c1a8@linux.intel.com/
diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..8784969d91e1 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -251,6 +251,18 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
 extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
+static inline bool perfmon_capable(void)
+{
+	struct user_namespace *ns = &init_user_ns;
+
+	if (ns_capable_noaudit(ns, CAP_PERFMON))
+		return ns_capable(ns, CAP_PERFMON);
+
+	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
+		return ns_capable(ns, CAP_SYS_ADMIN);
+
+	return false;
+}
 
 /* audit system wants to get cap info from files as well */
 extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..8b416e5f3afa 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,14 @@ struct vfs_ns_cap_data {
 
 #define CAP_AUDIT_READ		37
 
+/*
+ * Allow system performance and observability privileged operations
+ * using perf_events, i915_perf and other kernel subsystems
+ */
+
+#define CAP_PERFMON		38
 
-#define CAP_LAST_CAP         CAP_AUDIT_READ
+#define CAP_LAST_CAP         CAP_PERFMON
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 7db24855e12d..c599b0c2b0e7 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
 	    "audit_control", "setfcap"
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
-		"wake_alarm", "block_suspend", "audit_read"
+		"wake_alarm", "block_suspend", "audit_read", "perfmon"
 
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_PERFMON
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
Introduce CAP_PERFMON capability designed to secure system performance
monitoring and observability operations so that CAP_PERFMON would assist
CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
and other performance monitoring and observability subsystems.

CAP_PERFMON intends to harden system security and integrity during system
performance monitoring and observability operations by decreasing attack
surface that is available to a CAP_SYS_ADMIN privileged process [1].
Providing access to system performance monitoring and observability
operations under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials and
makes operation more secure.

CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
system performance monitoring and observability operations and balance
amount of CAP_SYS_ADMIN credentials following the recommendations in the
capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
overloaded; see Notes to kernel developers, below."

Although the software running under CAP_PERFMON can not ensure avoidance
of related hardware issues, the software can still mitigate these issues
following the official embargoed hardware issues mitigation procedure [2].
The bugs in the software itself could be fixed following the standard
kernel development process [3] to maintain and harden security of system
performance monitoring and observability operations.

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
[2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
[3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 include/linux/capability.h          | 12 ++++++++++++
 include/uapi/linux/capability.h     |  8 +++++++-
 security/selinux/include/classmap.h |  4 ++--
 3 files changed, 21 insertions(+), 3 deletions(-)