ima: Handle -ESTALE returned by ima_filter_rule_match()

Message ID	20220818020551.18922-1-guozihua@huawei.com (mailing list archive)
State	New, archived
Headers	Return-Path: <linux-integrity-owner@kernel.org> From: GUO Zihua <guozihua@huawei.com> To: <linux-integrity@vger.kernel.org>, <zohar@linux.ibm.com>, <dmitry.kasatkin@gmail.com>, <paul@paul-moore.com> Subject: [PATCH] ima: Handle -ESTALE returned by ima_filter_rule_match() Date: Thu, 18 Aug 2022 10:05:51 +0800 Message-ID: <20220818020551.18922-1-guozihua@huawei.com> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk

Guozihua (Scott) Aug. 18, 2022, 2:05 a.m. UTC

IMA relies on the LSM policy update notifier to be notified when it
should update its LSM rules.

When SELinux updates its policies, IMA is notified and starts
updating all its LSM rules one by one. During this time, -ESTALE is
returned by ima_filter_rule_match() if it is called with an LSM rule
that has not yet been updated. In ima_match_rules(), -ESTALE is not
handled, and the LSM rule is considered a match, causing extra files
to be measured by IMA.

Fix it by retrying at most three times if -ESTALE is returned by
ima_filter_rule_match().

Fixes: b16942455193 ("ima: use the lsm policy update notifier")
Signed-off-by: GUO Zihua <guozihua@huawei.com>
---
 security/integrity/ima/ima_policy.c | 8 ++++++++
 1 file changed, 8 insertions(+)
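
The diff itself is not reproduced on this page. As a rough illustration of
the bounded-retry idea described in the commit message, here is a minimal
user-space sketch. It is not the submitted patch: MAX_ESTALE_RETRIES,
struct fake_lsm, fake_rule_match() and rule_matches() are hypothetical
names, and treating a rule that is still stale after the last retry as
"no match" is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant; the thread only says "at most three times". */
#define MAX_ESTALE_RETRIES 3

/* Hypothetical stand-in for the signature of a rule matcher. */
typedef int (*match_fn)(void *ctx);

/* Fake LSM state used to simulate a rule that is stale for a while. */
struct fake_lsm {
	int stale_calls_left;	/* calls that will still return -ESTALE */
	int real_result;	/* 1 = match, 0 = no match */
	int calls;		/* how many times the matcher was invoked */
};

/* Returns -ESTALE until the simulated update has "caught up". */
static int fake_rule_match(void *ctx)
{
	struct fake_lsm *lsm = ctx;

	lsm->calls++;
	if (lsm->stale_calls_left > 0) {
		lsm->stale_calls_left--;
		return -ESTALE;
	}
	return lsm->real_result;
}

/*
 * Call the matcher once, then retry up to MAX_ESTALE_RETRIES more times
 * while it keeps returning -ESTALE. Assumption: a rule that is still
 * stale after the last retry is reported as "no match".
 */
static bool rule_matches(match_fn match, void *ctx)
{
	int rc;
	int retries = 0;

	assert(match != NULL);
	do {
		rc = match(ctx);
	} while (rc == -ESTALE && retries++ < MAX_ESTALE_RETRIES);

	return rc > 0;
}
```

With this shape, a rule that becomes fresh within three retries is judged
on its real result, and the matcher is never called more than four times
for one rule.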

Mimi Zohar Aug. 18, 2022, 1:43 p.m. UTC | #1

Hi Scott,

On Thu, 2022-08-18 at 10:05 +0800, GUO Zihua wrote:
> IMA relies on lsm policy update notifier to be notified when it should
> update it's lsm rules.

^IMA relies on the blocking LSM policy notifier callback to update the
LSM based IMA policy rules.

> When SELinux update it's policies, ima would be notified and starts
> updating all its lsm rules one-by-one. During this time, -ESTALE would
> be returned by ima_filter_rule_match() if it is called with a lsm rule
> that has not yet been updated. In ima_match_rules(), -ESTALE is not
> handled, and the lsm rule is considered a match, causing extra files
> be measured by IMA.
> 
> Fix it by retrying for at most three times if -ESTALE is returned by
> ima_filter_rule_match().

With the lazy LSM policy update, retrying only once was needed.  With
the blocking LSM notifier callback, why is three times needed?  Is this
really a function of how long it takes IMA to walk and update ALL the
LSM based IMA policy rules?  Would having SELinux wait for the -ESTALE
to change do anything?

> 
> Fixes: b16942455193 ("ima: use the lsm policy update notifier")
> Signed-off-by: GUO Zihua <guozihua@huawei.com>

thanks,

Mimi

Guozihua (Scott) Aug. 19, 2022, 1:50 a.m. UTC | #2

On 2022/8/18 21:43, Mimi Zohar wrote:
> Hi Scott,
> 
> On Thu, 2022-08-18 at 10:05 +0800, GUO Zihua wrote:
>> IMA relies on lsm policy update notifier to be notified when it should
>> update it's lsm rules.
> 
> ^IMA relies on the blocking LSM policy notifier callback to update the
> LSM based IMA policy rules.

I'll fix this in the next version.
> 
>> When SELinux update it's policies, ima would be notified and starts
>> updating all its lsm rules one-by-one. During this time, -ESTALE would
>> be returned by ima_filter_rule_match() if it is called with a lsm rule
>> that has not yet been updated. In ima_match_rules(), -ESTALE is not
>> handled, and the lsm rule is considered a match, causing extra files
>> be measured by IMA.
>>
>> Fix it by retrying for at most three times if -ESTALE is returned by
>> ima_filter_rule_match().
> 
> With the lazy LSM policy update, retrying only once was needed.  With
> the blocking LSM notifier callback, why is three times needed?  Is this
> really a function of how long it takes IMA to walk and update ALL the
> LSM based IMA policy rules?  Would having SELinux wait for the -ESTALE
> to change do anything?

With the lazy policy update, the update is triggered and finishes
before the retry. With a notifier callback, however, the update runs
in a different process, which might introduce extra latency.
Technically, once one rule has been updated, any following rules will
have been updated by the time they are read as well, so the retry
should happen only on the first rule affected by the SELinux policy
update. Retrying three times here leaves some time for the notifier to
finish its job of updating the rules.
>>
>> Fixes: b16942455193 ("ima: use the lsm policy update notifier")
>> Signed-off-by: GUO Zihua <guozihua@huawei.com>
> 
> thanks,
> 
> Mimi
> 
> .

Mimi Zohar Aug. 22, 2022, 2:41 p.m. UTC | #3

On Fri, 2022-08-19 at 09:50 +0800, Guozihua (Scott) wrote:
> On 2022/8/18 21:43, Mimi Zohar wrote:
> > Hi Scott,
> > 
> > On Thu, 2022-08-18 at 10:05 +0800, GUO Zihua wrote:
> >> IMA relies on lsm policy update notifier to be notified when it should
> >> update it's lsm rules.
> > 
> > ^IMA relies on the blocking LSM policy notifier callback to update the
> > LSM based IMA policy rules.
> 
> I'll fix this in the next version.

Thanks.

> > 
> >> When SELinux update it's policies, ima would be notified and starts
> >> updating all its lsm rules one-by-one. During this time, -ESTALE would
> >> be returned by ima_filter_rule_match() if it is called with a lsm rule
> >> that has not yet been updated. In ima_match_rules(), -ESTALE is not
> >> handled, and the lsm rule is considered a match, causing extra files
> >> be measured by IMA.
> >>
> >> Fix it by retrying for at most three times if -ESTALE is returned by
> >> ima_filter_rule_match().
> > 
> > With the lazy LSM policy update, retrying only once was needed.  With
> > the blocking LSM notifier callback, why is three times needed?  Is this
> > really a function of how long it takes IMA to walk and update ALL the
> > LSM based IMA policy rules?  Would having SELinux wait for the -ESTALE
> > to change do anything?
> 
> With lazy policy update, policy update is triggered and would be 
> finished before retrying. However, with a notifier callback, the update 
> runs in a different process which might introduce extra latency. 
> Technically if one rule has been updated, any following rules would have 
> been updated at the time they are read as well, thus the retry should 
> happen on the first rule affected by SELinux policy update only. 
> Retrying for three times here would leave some time for the notifier to 
> finish it's job on updating the rules.

The question is whether we're waiting for the SELinux policy to change
from ESTALE or whether it is the number of SELinux based IMA policy
rules or some combination of the two.  Retrying three times seems to be
random.  If SELinux waited for ESTALE to change, then it would only be
dependent on the time it took to update the SELinux based IMA policy
rules.

thanks,

Mimi

> >>
> >> Fixes: b16942455193 ("ima: use the lsm policy update notifier")
> >> Signed-off-by: GUO Zihua <guozihua@huawei.com>

Guozihua (Scott) Aug. 23, 2022, 8:12 a.m. UTC | #4

On 2022/8/22 22:41, Mimi Zohar wrote:
> On Fri, 2022-08-19 at 09:50 +0800, Guozihua (Scott) wrote:
>> On 2022/8/18 21:43, Mimi Zohar wrote:
>>> Hi Scott,
>>>
>>> On Thu, 2022-08-18 at 10:05 +0800, GUO Zihua wrote:
>>>> IMA relies on lsm policy update notifier to be notified when it should
>>>> update it's lsm rules.
>>>
>>> ^IMA relies on the blocking LSM policy notifier callback to update the
>>> LSM based IMA policy rules.
>>
>> I'll fix this in the next version.
> 
> Thanks.
> 
>>>
>>>> When SELinux update it's policies, ima would be notified and starts
>>>> updating all its lsm rules one-by-one. During this time, -ESTALE would
>>>> be returned by ima_filter_rule_match() if it is called with a lsm rule
>>>> that has not yet been updated. In ima_match_rules(), -ESTALE is not
>>>> handled, and the lsm rule is considered a match, causing extra files
>>>> be measured by IMA.
>>>>
>>>> Fix it by retrying for at most three times if -ESTALE is returned by
>>>> ima_filter_rule_match().
>>>
>>> With the lazy LSM policy update, retrying only once was needed.  With
>>> the blocking LSM notifier callback, why is three times needed?  Is this
>>> really a function of how long it takes IMA to walk and update ALL the
>>> LSM based IMA policy rules?  Would having SELinux wait for the -ESTALE
>>> to change do anything?
>>
>> With lazy policy update, policy update is triggered and would be
>> finished before retrying. However, with a notifier callback, the update
>> runs in a different process which might introduce extra latency.
>> Technically if one rule has been updated, any following rules would have
>> been updated at the time they are read as well, thus the retry should
>> happen on the first rule affected by SELinux policy update only.
>> Retrying for three times here would leave some time for the notifier to
>> finish it's job on updating the rules.
> 
> The question is whether we're waiting for the SELinux policy to change
> from ESTALE or whether it is the number of SELinux based IMA policy
> rules or some combination of the two.  Retrying three times seems to be
> random.  If SELinux waited for ESTALE to change, then it would only be
> dependent on the time it took to update the SELinux based IMA policy
> rules.

We are waiting for ima_lsm_update_rules() to finish re-initializing all 
the LSM based rules.

Once a new policy takes effect in SELinux, the policy sequence number
is incremented. During rule matching, this sequence number is checked
and, on a mismatch, -ESTALE is returned and the rules should be
re-initialized. Normally, ima_lsm_update_rules() should already be
running at this point, so we wait for it to finish.
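
To make the sequence-number check concrete, here is a small user-space
model. It is only a sketch of the mechanism described above, not SELinux
or IMA code: struct policy, struct lsm_rule, policy_reload(), rule_init()
and rule_match() are all hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the loaded policy; seqno is bumped on each load. */
struct policy {
	unsigned int seqno;
};

/* Hypothetical rule; seqno records the policy it was initialized against. */
struct lsm_rule {
	unsigned int seqno;
	int label;
};

/* Simulate a new policy taking effect. */
static void policy_reload(struct policy *p)
{
	p->seqno++;
}

/* (Re-)initialize a rule against the currently loaded policy. */
static void rule_init(struct lsm_rule *r, const struct policy *p, int label)
{
	r->seqno = p->seqno;
	r->label = label;
}

/*
 * Return 1 on match, 0 on no match, or -ESTALE when the rule was
 * initialized against an older policy than the one now loaded.
 */
static int rule_match(const struct lsm_rule *r, const struct policy *p,
		      int subject_label)
{
	assert(r != NULL && p != NULL);
	if (r->seqno != p->seqno)
		return -ESTALE;
	return r->label == subject_label;
}
```

In this model, every rule returns -ESTALE between policy_reload() and the
moment rule_init() is run on it again, which is the window the retries
are meant to cover.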
> 
> thanks,
> 
> Mimi
> 
>>>>
>>>> Fixes: b16942455193 ("ima: use the lsm policy update notifier")
>>>> Signed-off-by: GUO Zihua <guozihua@huawei.com>
> 
> .

Mimi Zohar Aug. 23, 2022, 1:21 p.m. UTC | #5

On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
> > The question is whether we're waiting for the SELinux policy to change
> > from ESTALE or whether it is the number of SELinux based IMA policy
> > rules or some combination of the two.  Retrying three times seems to be
> > random.  If SELinux waited for ESTALE to change, then it would only be
> > dependent on the time it took to update the SELinux based IMA policy
> > rules.
> 
> We are waiting for ima_lsm_update_rules() to finish re-initializing all 
> the LSM based rules.

Fine.  Hopefully retrying a maximum of 3 times is sufficient.

Guozihua (Scott) Aug. 23, 2022, 1:28 p.m. UTC | #6

On 2022/8/23 21:21, Mimi Zohar wrote:
> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
>>> The question is whether we're waiting for the SELinux policy to change
>>> from ESTALE or whether it is the number of SELinux based IMA policy
>>> rules or some combination of the two.  Retrying three times seems to be
>>> random.  If SELinux waited for ESTALE to change, then it would only be
>>> dependent on the time it took to update the SELinux based IMA policy
>>> rules.
>>
>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
>> the LSM based rules.
> 
> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
> 
Well, at least this should greatly reduce the chance of this issue
happening. This is the best I can think of without locking and
busy waiting. Maybe we could also add delays before we retry. Do you
have any other thoughts in mind?

Mimi Zohar Aug. 24, 2022, 1:26 a.m. UTC | #7

On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
> On 2022/8/23 21:21, Mimi Zohar wrote:
> > On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
> >>> The question is whether we're waiting for the SELinux policy to change
> >>> from ESTALE or whether it is the number of SELinux based IMA policy
> >>> rules or some combination of the two.  Retrying three times seems to be
> >>> random.  If SELinux waited for ESTALE to change, then it would only be
> >>> dependent on the time it took to update the SELinux based IMA policy
> >>> rules.
> >>
> >> We are waiting for ima_lsm_update_rules() to finish re-initializing all
> >> the LSM based rules.
> > 
> > Fine.  Hopefully retrying a maximum of 3 times is sufficient.
> > 
> Well, at least this should greatly reduce the chance of this issue from 
> happening.

Agreed

> This would be the best we I can think of without locking and 
> busy waiting. Maybe we can also add delays before we retry. Maybe you 
> got any other thought in mind?

Another option would be to re-introduce the equivalent of the "lazy"
LSM update on -ESTALE, but without updating the policy rule, as the
notifier callback will eventually get to it.

Guozihua (Scott) Aug. 24, 2022, 1:56 a.m. UTC | #8

On 2022/8/24 9:26, Mimi Zohar wrote:
> On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
>> On 2022/8/23 21:21, Mimi Zohar wrote:
>>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
>>>>> The question is whether we're waiting for the SELinux policy to change
>>>>> from ESTALE or whether it is the number of SELinux based IMA policy
>>>>> rules or some combination of the two.  Retrying three times seems to be
>>>>> random.  If SELinux waited for ESTALE to change, then it would only be
>>>>> dependent on the time it took to update the SELinux based IMA policy
>>>>> rules.
>>>>
>>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
>>>> the LSM based rules.
>>>
>>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
>>>
>> Well, at least this should greatly reduce the chance of this issue from
>> happening.
> 
> Agreed
> 
>> This would be the best we I can think of without locking and
>> busy waiting. Maybe we can also add delays before we retry. Maybe you
>> got any other thought in mind?
> 
> Another option would be to re-introduce the equivalent of the "lazy"
> LSM update on -ESTALE, but without updating the policy rule, as the
> notifier callback will eventually get to it.
> 

For this to happen we would need a way to tell when we are able to 
continue with the retry though.

Mimi Zohar Aug. 25, 2022, 1:02 p.m. UTC | #9

On Wed, 2022-08-24 at 09:56 +0800, Guozihua (Scott) wrote:
> On 2022/8/24 9:26, Mimi Zohar wrote:
> > On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
> >> On 2022/8/23 21:21, Mimi Zohar wrote:
> >>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
> >>>>> The question is whether we're waiting for the SELinux policy to change
> >>>>> from ESTALE or whether it is the number of SELinux based IMA policy
> >>>>> rules or some combination of the two.  Retrying three times seems to be
> >>>>> random.  If SELinux waited for ESTALE to change, then it would only be
> >>>>> dependent on the time it took to update the SELinux based IMA policy
> >>>>> rules.
> >>>>
> >>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
> >>>> the LSM based rules.
> >>>
> >>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
> >>>
> >> Well, at least this should greatly reduce the chance of this issue from
> >> happening.
> > 
> > Agreed
> > 
> >> This would be the best we I can think of without locking and
> >> busy waiting. Maybe we can also add delays before we retry. Maybe you
> >> got any other thought in mind?
> > 
> > Another option would be to re-introduce the equivalent of the "lazy"
> > LSM update on -ESTALE, but without updating the policy rule, as the
> > notifier callback will eventually get to it.
> > 
> 
> For this to happen we would need a way to tell when we are able to 
> continue with the retry though.

Previously, with the lazy update, security_filter_rule_init() was
called on failure before the retry.  To avoid locking or detecting when
to continue, another option would be to call
security_filter_rule_init() with a local copy of the rule.  The retry
would be based on a local copy of the rule.

Eventually the registered callback will complete, so we don't need to
be concerned about updating the actual rules.
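
A user-space sketch of the local-copy idea follows. It is illustrative
only and not kernel code: the structs and match_with_local_fallback() are
hypothetical names, and refreshing the copy's sequence number stands in
for re-running security_filter_rule_init() on the copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the loaded policy and of an LSM based rule. */
struct policy {
	unsigned int seqno;
};

struct lsm_rule {
	unsigned int seqno;	/* policy the rule was initialized against */
	int label;
};

/* 1 = match, 0 = no match, -ESTALE = rule predates the loaded policy. */
static int rule_match(const struct lsm_rule *r, const struct policy *p,
		      int subject_label)
{
	if (r->seqno != p->seqno)
		return -ESTALE;
	return r->label == subject_label;
}

/*
 * On -ESTALE, re-initialize a private copy of the rule and match against
 * that copy. The shared rule is never written here; updating it is left
 * to the notifier callback.
 */
static bool match_with_local_fallback(const struct lsm_rule *shared,
				      const struct policy *p,
				      int subject_label)
{
	struct lsm_rule local;
	int rc;

	assert(shared != NULL && p != NULL);
	rc = rule_match(shared, p, subject_label);
	if (rc == -ESTALE) {
		local = *shared;	/* private copy of the stale rule */
		local.seqno = p->seqno;	/* stand-in for re-initialization */
		rc = rule_match(&local, p, subject_label);
	}
	return rc > 0;
}
```

Because only the local copy is touched, the matching path needs no extra
locking, and the result no longer depends on how far the notifier has
progressed.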

Guozihua (Scott) Aug. 27, 2022, 9:57 a.m. UTC | #10

On 2022/8/25 21:02, Mimi Zohar wrote:
> On Wed, 2022-08-24 at 09:56 +0800, Guozihua (Scott) wrote:
>> On 2022/8/24 9:26, Mimi Zohar wrote:
>>> On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
>>>> On 2022/8/23 21:21, Mimi Zohar wrote:
>>>>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
>>>>>>> The question is whether we're waiting for the SELinux policy to change
>>>>>>> from ESTALE or whether it is the number of SELinux based IMA policy
>>>>>>> rules or some combination of the two.  Retrying three times seems to be
>>>>>>> random.  If SELinux waited for ESTALE to change, then it would only be
>>>>>>> dependent on the time it took to update the SELinux based IMA policy
>>>>>>> rules.
>>>>>>
>>>>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
>>>>>> the LSM based rules.
>>>>>
>>>>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
>>>>>
>>>> Well, at least this should greatly reduce the chance of this issue from
>>>> happening.
>>>
>>> Agreed
>>>
>>>> This would be the best we I can think of without locking and
>>>> busy waiting. Maybe we can also add delays before we retry. Maybe you
>>>> got any other thought in mind?
>>>
>>> Another option would be to re-introduce the equivalent of the "lazy"
>>> LSM update on -ESTALE, but without updating the policy rule, as the
>>> notifier callback will eventually get to it.
>>>
>>
>> For this to happen we would need a way to tell when we are able to
>> continue with the retry though.
> 
> Previously with the lazy update, on failure security_filter_rule_init()
> was called before the retry.  To avoid locking or detecting when to
> continue, another option would be to call to
> security_filter_rule_init() with a local copy of the rule.  The retry
> would be based on a local copy of the rule.
> 
> Eventually the registered callback will complete, so we don't need to
> be concerned about updating the actual rules.

Could this cause a race condition, though? With this, the notifier
path seems to be unnecessary.

Mimi Zohar Aug. 30, 2022, 1:20 a.m. UTC | #11

On Sat, 2022-08-27 at 17:57 +0800, Guozihua (Scott) wrote:
> On 2022/8/25 21:02, Mimi Zohar wrote:
> > On Wed, 2022-08-24 at 09:56 +0800, Guozihua (Scott) wrote:
> >> On 2022/8/24 9:26, Mimi Zohar wrote:
> >>> On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
> >>>> On 2022/8/23 21:21, Mimi Zohar wrote:
> >>>>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
> >>>>>>> The question is whether we're waiting for the SELinux policy to change
> >>>>>>> from ESTALE or whether it is the number of SELinux based IMA policy
> >>>>>>> rules or some combination of the two.  Retrying three times seems to be
> >>>>>>> random.  If SELinux waited for ESTALE to change, then it would only be
> >>>>>>> dependent on the time it took to update the SELinux based IMA policy
> >>>>>>> rules.
> >>>>>>
> >>>>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
> >>>>>> the LSM based rules.
> >>>>>
> >>>>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
> >>>>>
> >>>> Well, at least this should greatly reduce the chance of this issue from
> >>>> happening.
> >>>
> >>> Agreed
> >>>
> >>>> This would be the best we I can think of without locking and
> >>>> busy waiting. Maybe we can also add delays before we retry. Maybe you
> >>>> got any other thought in mind?
> >>>
> >>> Another option would be to re-introduce the equivalent of the "lazy"
> >>> LSM update on -ESTALE, but without updating the policy rule, as the
> >>> notifier callback will eventually get to it.
> >>>
> >>
> >> For this to happen we would need a way to tell when we are able to
> >> continue with the retry though.
> > 
> > Previously with the lazy update, on failure security_filter_rule_init()
> > was called before the retry.  To avoid locking or detecting when to
> > continue, another option would be to call to
> > security_filter_rule_init() with a local copy of the rule.  The retry
> > would be based on a local copy of the rule.
> > 
> > Eventually the registered callback will complete, so we don't need to
> > be concerned about updating the actual rules.
> 
> Is it possible to cause race condition though? With this, the notifier 
> path seems to be unnecessary.

I don't see how there would be a race condition.  The notifier callback
is the normal method of updating the policy rules.  Hopefully -ESTALE
isn't something that happens frequently.

Guozihua (Scott) Aug. 30, 2022, 8:41 a.m. UTC | #12

On 2022/8/30 9:20, Mimi Zohar wrote:
> On Sat, 2022-08-27 at 17:57 +0800, Guozihua (Scott) wrote:
>> On 2022/8/25 21:02, Mimi Zohar wrote:
>>> On Wed, 2022-08-24 at 09:56 +0800, Guozihua (Scott) wrote:
>>>> On 2022/8/24 9:26, Mimi Zohar wrote:
>>>>> On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
>>>>>> On 2022/8/23 21:21, Mimi Zohar wrote:
>>>>>>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
>>>>>>>>> The question is whether we're waiting for the SELinux policy to change
>>>>>>>>> from ESTALE or whether it is the number of SELinux based IMA policy
>>>>>>>>> rules or some combination of the two.  Retrying three times seems to be
>>>>>>>>> random.  If SELinux waited for ESTALE to change, then it would only be
>>>>>>>>> dependent on the time it took to update the SELinux based IMA policy
>>>>>>>>> rules.
>>>>>>>>
>>>>>>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
>>>>>>>> the LSM based rules.
>>>>>>>
>>>>>>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
>>>>>>>
>>>>>> Well, at least this should greatly reduce the chance of this issue from
>>>>>> happening.
>>>>>
>>>>> Agreed
>>>>>
>>>>>> This would be the best we I can think of without locking and
>>>>>> busy waiting. Maybe we can also add delays before we retry. Maybe you
>>>>>> got any other thought in mind?
>>>>>
>>>>> Another option would be to re-introduce the equivalent of the "lazy"
>>>>> LSM update on -ESTALE, but without updating the policy rule, as the
>>>>> notifier callback will eventually get to it.
>>>>>
>>>>
>>>> For this to happen we would need a way to tell when we are able to
>>>> continue with the retry though.
>>>
>>> Previously with the lazy update, on failure security_filter_rule_init()
>>> was called before the retry.  To avoid locking or detecting when to
>>> continue, another option would be to call to
>>> security_filter_rule_init() with a local copy of the rule.  The retry
>>> would be based on a local copy of the rule.
>>>
>>> Eventually the registered callback will complete, so we don't need to
>>> be concerned about updating the actual rules.
>>
>> Is it possible to cause race condition though? With this, the notifier
>> path seems to be unnecessary.
> 
> I don't see how there would be a race condition.  The notifier callback
> is the normal method of updating the policy rules.  Hopefully -ESTALE
> isn't something that happens frequently.

The notifier callback uses RCU to update rules; I think we should mimic
that behavior if we are to update individual rules in the matching logic.

Mimi Zohar Aug. 30, 2022, 12:03 p.m. UTC | #13

On Tue, 2022-08-30 at 16:41 +0800, Guozihua (Scott) wrote:
> On 2022/8/30 9:20, Mimi Zohar wrote:
> > On Sat, 2022-08-27 at 17:57 +0800, Guozihua (Scott) wrote:
> >> On 2022/8/25 21:02, Mimi Zohar wrote:
> >>> On Wed, 2022-08-24 at 09:56 +0800, Guozihua (Scott) wrote:
> >>>> On 2022/8/24 9:26, Mimi Zohar wrote:
> >>>>> On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
> >>>>>> On 2022/8/23 21:21, Mimi Zohar wrote:
> >>>>>>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
> >>>>>>>>> The question is whether we're waiting for the SELinux policy to change
> >>>>>>>>> from ESTALE or whether it is the number of SELinux based IMA policy
> >>>>>>>>> rules or some combination of the two.  Retrying three times seems to be
> >>>>>>>>> random.  If SELinux waited for ESTALE to change, then it would only be
> >>>>>>>>> dependent on the time it took to update the SELinux based IMA policy
> >>>>>>>>> rules.
> >>>>>>>>
> >>>>>>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
> >>>>>>>> the LSM based rules.
> >>>>>>>
> >>>>>>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
> >>>>>>>
> >>>>>> Well, at least this should greatly reduce the chance of this issue from
> >>>>>> happening.
> >>>>>
> >>>>> Agreed
> >>>>>
> >>>>>> This would be the best we I can think of without locking and
> >>>>>> busy waiting. Maybe we can also add delays before we retry. Maybe you
> >>>>>> got any other thought in mind?
> >>>>>
> >>>>> Another option would be to re-introduce the equivalent of the "lazy"
> >>>>> LSM update on -ESTALE, but without updating the policy rule, as the
> >>>>> notifier callback will eventually get to it.
> >>>>>
> >>>>
> >>>> For this to happen we would need a way to tell when we are able to
> >>>> continue with the retry though.
> >>>
> >>> Previously with the lazy update, on failure security_filter_rule_init()
> >>> was called before the retry.  To avoid locking or detecting when to
> >>> continue, another option would be to call to
> >>> security_filter_rule_init() with a local copy of the rule.  The retry
> >>> would be based on a local copy of the rule.
> >>>
> >>> Eventually the registered callback will complete, so we don't need to
> >>> be concerned about updating the actual rules.
> >>
> >> Is it possible to cause race condition though? With this, the notifier
> >> path seems to be unnecessary.
> > 
> > I don't see how there would be a race condition.  The notifier callback
> > is the normal method of updating the policy rules.  Hopefully -ESTALE
> > isn't something that happens frequently.
> 
> The notifier callback uses RCU to update rules, I think we should mimic 
> that behavior if we are to update individual rules in the matching logic.

If the callback update hasn't completed, causing an -ESTALE, the
fallback is to directly query the LSM for a single IMA policy rule.
Please keep it simple.

Guozihua (Scott) Aug. 30, 2022, 12:13 p.m. UTC | #14

On 2022/8/30 20:03, Mimi Zohar wrote:
> On Tue, 2022-08-30 at 16:41 +0800, Guozihua (Scott) wrote:
>> On 2022/8/30 9:20, Mimi Zohar wrote:
>>> On Sat, 2022-08-27 at 17:57 +0800, Guozihua (Scott) wrote:
>>>> On 2022/8/25 21:02, Mimi Zohar wrote:
>>>>> On Wed, 2022-08-24 at 09:56 +0800, Guozihua (Scott) wrote:
>>>>>> On 2022/8/24 9:26, Mimi Zohar wrote:
>>>>>>> On Tue, 2022-08-23 at 21:28 +0800, Guozihua (Scott) wrote:
>>>>>>>> On 2022/8/23 21:21, Mimi Zohar wrote:
>>>>>>>>> On Tue, 2022-08-23 at 16:12 +0800, Guozihua (Scott) wrote:
>>>>>>>>>>> The question is whether we're waiting for the SELinux policy to change
>>>>>>>>>>> from ESTALE or whether it is the number of SELinux based IMA policy
>>>>>>>>>>> rules or some combination of the two.  Retrying three times seems to be
>>>>>>>>>>> random.  If SELinux waited for ESTALE to change, then it would only be
>>>>>>>>>>> dependent on the time it took to update the SELinux based IMA policy
>>>>>>>>>>> rules.
>>>>>>>>>>
>>>>>>>>>> We are waiting for ima_lsm_update_rules() to finish re-initializing all
>>>>>>>>>> the LSM based rules.
>>>>>>>>>
>>>>>>>>> Fine.  Hopefully retrying a maximum of 3 times is sufficient.
>>>>>>>>>
>>>>>>>> Well, at least this should greatly reduce the chance of this issue from
>>>>>>>> happening.
>>>>>>>
>>>>>>> Agreed
>>>>>>>
>>>>>>>> This would be the best we I can think of without locking and
>>>>>>>> busy waiting. Maybe we can also add delays before we retry. Maybe you
>>>>>>>> got any other thought in mind?
>>>>>>>
>>>>>>> Another option would be to re-introduce the equivalent of the "lazy"
>>>>>>> LSM update on -ESTALE, but without updating the policy rule, as the
>>>>>>> notifier callback will eventually get to it.
>>>>>>>
>>>>>>
>>>>>> For this to happen we would need a way to tell when we are able to
>>>>>> continue with the retry though.
>>>>>
>>>>> Previously with the lazy update, on failure security_filter_rule_init()
>>>>> was called before the retry.  To avoid locking or detecting when to
>>>>> continue, another option would be to call to
>>>>> security_filter_rule_init() with a local copy of the rule.  The retry
>>>>> would be based on a local copy of the rule.
>>>>>
>>>>> Eventually the registered callback will complete, so we don't need to
>>>>> be concerned about updating the actual rules.
>>>>
>>>> Is it possible to cause race condition though? With this, the notifier
>>>> path seems to be unnecessary.
>>>
>>> I don't see how there would be a race condition.  The notifier callback
>>> is the normal method of updating the policy rules.  Hopefully -ESTALE
>>> isn't something that happens frequently.
>>
>> The notifier callback uses RCU to update rules, I think we should mimic
>> that behavior if we are to update individual rules in the matching logic.
> 
> If the callback update hasn't completed causing an -ESTALE, the
> fallback is to directly query the LSM for a single IMA policy rule.
> Please keep it simple.
> 

Got it, I'll send a new patch.
