diff mbox series

mm: Check writable zero page in page table check

Message ID 20220902232732.12358-1-rick.p.edgecombe@intel.com (mailing list archive)
State New
Headers show
Series mm: Check writable zero page in page table check | expand

Commit Message

Edgecombe, Rick P Sept. 2, 2022, 11:27 p.m. UTC
The zero page should remain all zero, so that it can be mapped as
read-only for read faults of memory that should be zeroed. If it is ever
mapped writable to userspace, it could become non-zero and so other apps
would unexpectedly get non-zero data. So the zero page should never be
mapped writable to userspace. Check for this condition in
page_table_check_set().

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---

Hi,

CONFIG_PAGE_TABLE_CHECK is pretty explicit about what it checks (and
doesn't mention the zero page), but this condition seems to fit with the
general category of "pages mapped wrongly to userspace". I added it
locally to help me debug something. Maybe it's more widely useful.

 mm/page_table_check.c | 2 ++
 1 file changed, 2 insertions(+)


base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5

Comments

Huang, Shaoqin Sept. 3, 2022, 2:13 a.m. UTC | #1
On 9/3/2022 7:27 AM, Rick Edgecombe wrote:
> The zero page should remain all zero, so that it can be mapped as
> read-only for read faults of memory that should be zeroed. If it is ever
> mapped writable to userspace, it could become non-zero and so other apps
> would unexpectedly get non-zero data. So the zero page should never be
> mapped writable to userspace. Check for this condition in
> page_table_check_set().
> 
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> 
> ---
> 
> Hi,
> 
> CONFIG_PAGE_TABLE_CHECK is pretty explicit about what it checks (and
> doesn't mention the zero page), but this condition seems to fit with the
> general category of "pages mapped wrongly to userspace". I added it
> locally to help me debug something. Maybe it's more widely useful.
> 
>   mm/page_table_check.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/mm/page_table_check.c b/mm/page_table_check.c
> index e2062748791a..665ece0d55d4 100644
> --- a/mm/page_table_check.c
> +++ b/mm/page_table_check.c
> @@ -102,6 +102,8 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
>   	if (!pfn_valid(pfn))
>   		return;
>   
> +	BUG_ON(is_zero_pfn(pfn) && rw);
> +

Why we need use BUG_ON() here? Based on [1], we should avoid to use the 
BUG_ON() due to it will panic the machine.

[1]: https://lore.kernel.org/lkml/20220824163100.224449-1-david@redhat.com/

>   	page = pfn_to_page(pfn);
>   	page_ext = lookup_page_ext(page);
>   	anon = PageAnon(page);
> 
> base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
Edgecombe, Rick P Sept. 5, 2022, 6:50 p.m. UTC | #2
On Sat, 2022-09-03 at 10:13 +0800, Huang, Shaoqin wrote:
> > +     BUG_ON(is_zero_pfn(pfn) && rw);
> > +
> 
> Why we need use BUG_ON() here? Based on [1], we should avoid to use
> the 
> BUG_ON() due to it will panic the machine.
> 
> [1]: 
> https://lore.kernel.org/lkml/20220824163100.224449-1-david@redhat.com/

Yea, you are probably right. All the rest of this checker uses BUG_ON()
though. Maybe they should all be something else? Just felt weird to
have this be the only check that is different.

I don't have any objections to changing it to WARN_ON(). Should I
switch the rest of the checks here while I'm at it?
Huang, Shaoqin Sept. 6, 2022, 12:24 a.m. UTC | #3
On 9/6/2022 2:50 AM, Edgecombe, Rick P wrote:
> On Sat, 2022-09-03 at 10:13 +0800, Huang, Shaoqin wrote:
>>> +     BUG_ON(is_zero_pfn(pfn) && rw);
>>> +
>>
>> Why we need use BUG_ON() here? Based on [1], we should avoid to use
>> the
>> BUG_ON() due to it will panic the machine.
>>
>> [1]:
>> https://lore.kernel.org/lkml/20220824163100.224449-1-david@redhat.com/
> 
> Yea, you are probably right. All the rest of this checker uses BUG_ON()
> though. Maybe they should all be something else? Just felt weird to
> have this be the only check that is different.
> 
> I don't have any objections to changing it to WARN_ON(). Should I
> switch the rest of the checks here while I'm at it?

Yes. I think in most situation, WARN_ON() or WARN_ON_ONCE() is ok.
Pasha Tatashin Sept. 6, 2022, 12:37 a.m. UTC | #4
Hi Shaoqin,

The idea behind page table check is to prevent some types of memory
corruptions: i.e. prevent false page sharing, and memory leaking
between address spaces. This is an optional security feature for
setups where it is more dangerous to leak data than to crash the
machine. Therefore, when page table check detects illegal page sharing
it immediately crashes the kernel. I think we can have a
page_table_check option that would change BUG_ON to WARN_ON() (or to
WARN_ON_ONCE(), since once corruption is detected I believe it might
show up many times again)

Pasha

On Fri, Sep 2, 2022 at 10:13 PM Huang, Shaoqin <shaoqin.huang@intel.com> wrote:
>
>
>
> On 9/3/2022 7:27 AM, Rick Edgecombe wrote:
> > The zero page should remain all zero, so that it can be mapped as
> > read-only for read faults of memory that should be zeroed. If it is ever
> > mapped writable to userspace, it could become non-zero and so other apps
> > would unexpectedly get non-zero data. So the zero page should never be
> > mapped writable to userspace. Check for this condition in
> > page_table_check_set().
> >
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> >
> > ---
> >
> > Hi,
> >
> > CONFIG_PAGE_TABLE_CHECK is pretty explicit about what it checks (and
> > doesn't mention the zero page), but this condition seems to fit with the
> > general category of "pages mapped wrongly to userspace". I added it
> > locally to help me debug something. Maybe it's more widely useful.
> >
> >   mm/page_table_check.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/page_table_check.c b/mm/page_table_check.c
> > index e2062748791a..665ece0d55d4 100644
> > --- a/mm/page_table_check.c
> > +++ b/mm/page_table_check.c
> > @@ -102,6 +102,8 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
> >       if (!pfn_valid(pfn))
> >               return;
> >
> > +     BUG_ON(is_zero_pfn(pfn) && rw);
> > +
>
> Why we need use BUG_ON() here? Based on [1], we should avoid to use the
> BUG_ON() due to it will panic the machine.
>
> [1]: https://lore.kernel.org/lkml/20220824163100.224449-1-david@redhat.com/
>
> >       page = pfn_to_page(pfn);
> >       page_ext = lookup_page_ext(page);
> >       anon = PageAnon(page);
> >
> > base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
Pasha Tatashin Sept. 6, 2022, 12:38 a.m. UTC | #5
Hi Rock,

Good idea to add a check for write access to zero page. Can you please
also update Documentation/mm/page_table_check.rst ?

Thank you,
Pasha

On Fri, Sep 2, 2022 at 7:31 PM Rick Edgecombe
<rick.p.edgecombe@intel.com> wrote:
>
> The zero page should remain all zero, so that it can be mapped as
> read-only for read faults of memory that should be zeroed. If it is ever
> mapped writable to userspace, it could become non-zero and so other apps
> would unexpectedly get non-zero data. So the zero page should never be
> mapped writable to userspace. Check for this condition in
> page_table_check_set().
>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
>
> ---
>
> Hi,
>
> CONFIG_PAGE_TABLE_CHECK is pretty explicit about what it checks (and
> doesn't mention the zero page), but this condition seems to fit with the
> general category of "pages mapped wrongly to userspace". I added it
> locally to help me debug something. Maybe it's more widely useful.
>
>  mm/page_table_check.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/page_table_check.c b/mm/page_table_check.c
> index e2062748791a..665ece0d55d4 100644
> --- a/mm/page_table_check.c
> +++ b/mm/page_table_check.c
> @@ -102,6 +102,8 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
>         if (!pfn_valid(pfn))
>                 return;
>
> +       BUG_ON(is_zero_pfn(pfn) && rw);
> +
>         page = pfn_to_page(pfn);
>         page_ext = lookup_page_ext(page);
>         anon = PageAnon(page);
>
> base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
> --
> 2.17.1
>
Pasha Tatashin Sept. 6, 2022, 12:39 a.m. UTC | #6
Sorry: s/Rock/Rick

:-)

On Mon, Sep 5, 2022 at 8:38 PM Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
>
> Hi Rock,
>
> Good idea to add a check for write access to zero page. Can you please
> also update Documentation/mm/page_table_check.rst ?
>
> Thank you,
> Pasha
>
> On Fri, Sep 2, 2022 at 7:31 PM Rick Edgecombe
> <rick.p.edgecombe@intel.com> wrote:
> >
> > The zero page should remain all zero, so that it can be mapped as
> > read-only for read faults of memory that should be zeroed. If it is ever
> > mapped writable to userspace, it could become non-zero and so other apps
> > would unexpectedly get non-zero data. So the zero page should never be
> > mapped writable to userspace. Check for this condition in
> > page_table_check_set().
> >
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> >
> > ---
> >
> > Hi,
> >
> > CONFIG_PAGE_TABLE_CHECK is pretty explicit about what it checks (and
> > doesn't mention the zero page), but this condition seems to fit with the
> > general category of "pages mapped wrongly to userspace". I added it
> > locally to help me debug something. Maybe it's more widely useful.
> >
> >  mm/page_table_check.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/page_table_check.c b/mm/page_table_check.c
> > index e2062748791a..665ece0d55d4 100644
> > --- a/mm/page_table_check.c
> > +++ b/mm/page_table_check.c
> > @@ -102,6 +102,8 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
> >         if (!pfn_valid(pfn))
> >                 return;
> >
> > +       BUG_ON(is_zero_pfn(pfn) && rw);
> > +
> >         page = pfn_to_page(pfn);
> >         page_ext = lookup_page_ext(page);
> >         anon = PageAnon(page);
> >
> > base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
> > --
> > 2.17.1
> >
John Hubbard Sept. 6, 2022, 12:49 a.m. UTC | #7
On 9/5/22 17:24, Huang, Shaoqin wrote:
> 
> 
> On 9/6/2022 2:50 AM, Edgecombe, Rick P wrote:
>> On Sat, 2022-09-03 at 10:13 +0800, Huang, Shaoqin wrote:
>>>> +     BUG_ON(is_zero_pfn(pfn) && rw);
>>>> +
>>>
>>> Why we need use BUG_ON() here? Based on [1], we should avoid to use
>>> the
>>> BUG_ON() due to it will panic the machine.
>>>
>>> [1]:
>>> https://lore.kernel.org/lkml/20220824163100.224449-1-david@redhat.com/
>>
>> Yea, you are probably right. All the rest of this checker uses BUG_ON()
>> though. Maybe they should all be something else? Just felt weird to
>> have this be the only check that is different.
>>
>> I don't have any objections to changing it to WARN_ON(). Should I
>> switch the rest of the checks here while I'm at it?
> 
> Yes. I think in most situation, WARN_ON() or WARN_ON_ONCE() is ok.
> 

+1 for WARN_ON_ONCE(), and also for changing the other cases as you find
them, yes.


thanks,
Huang, Shaoqin Sept. 6, 2022, 1:01 a.m. UTC | #8
On 9/6/2022 8:37 AM, Pasha Tatashin wrote:
> Hi Shaoqin,
> 
> The idea behind page table check is to prevent some types of memory
> corruptions: i.e. prevent false page sharing, and memory leaking
> between address spaces. This is an optional security feature for
> setups where it is more dangerous to leak data than to crash the
> machine. Therefore, when page table check detects illegal page sharing
> it immediately crashes the kernel. I think we can have a
> page_table_check option that would change BUG_ON to WARN_ON() (or to
> WARN_ON_ONCE(), since once corruption is detected I believe it might
> show up many times again)
Hi, Pasha,

Thanks for your explanation. That's make sense.

> 
> Pasha
> 
> On Fri, Sep 2, 2022 at 10:13 PM Huang, Shaoqin <shaoqin.huang@intel.com> wrote:
>>
>>
>>
>> On 9/3/2022 7:27 AM, Rick Edgecombe wrote:
>>> The zero page should remain all zero, so that it can be mapped as
>>> read-only for read faults of memory that should be zeroed. If it is ever
>>> mapped writable to userspace, it could become non-zero and so other apps
>>> would unexpectedly get non-zero data. So the zero page should never be
>>> mapped writable to userspace. Check for this condition in
>>> page_table_check_set().
>>>
>>> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
>>>
>>> ---
>>>
>>> Hi,
>>>
>>> CONFIG_PAGE_TABLE_CHECK is pretty explicit about what it checks (and
>>> doesn't mention the zero page), but this condition seems to fit with the
>>> general category of "pages mapped wrongly to userspace". I added it
>>> locally to help me debug something. Maybe it's more widely useful >>>
>>>    mm/page_table_check.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/mm/page_table_check.c b/mm/page_table_check.c
>>> index e2062748791a..665ece0d55d4 100644
>>> --- a/mm/page_table_check.c
>>> +++ b/mm/page_table_check.c
>>> @@ -102,6 +102,8 @@ static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
>>>        if (!pfn_valid(pfn))
>>>                return;
>>>
>>> +     BUG_ON(is_zero_pfn(pfn) && rw);
>>> +
>>
>> Why we need use BUG_ON() here? Based on [1], we should avoid to use the
>> BUG_ON() due to it will panic the machine.
>>
>> [1]: https://lore.kernel.org/lkml/20220824163100.224449-1-david@redhat.com/
>>
>>>        page = pfn_to_page(pfn);
>>>        page_ext = lookup_page_ext(page);
>>>        anon = PageAnon(page);
>>>
>>> base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
>
Edgecombe, Rick P Sept. 6, 2022, 4:42 p.m. UTC | #9
On Mon, 2022-09-05 at 20:37 -0400, Pasha Tatashin wrote:
> The idea behind page table check is to prevent some types of memory
> corruptions: i.e. prevent false page sharing, and memory leaking
> between address spaces. This is an optional security feature for
> setups where it is more dangerous to leak data than to crash the
> machine. Therefore, when page table check detects illegal page
> sharing
> it immediately crashes the kernel. I think we can have a
> page_table_check option that would change BUG_ON to WARN_ON() (or to
> WARN_ON_ONCE(), since once corruption is detected I believe it might
> show up many times again)

Do you think there are a lot of people that would want to set page
table check to BUG_ON mode, that wouldn't already be setting
panic_on_warn?

I didn't realize page table check was meant to be a security feature as
well. I thought it was more of a debug time checker.

Looking through more related discussions, there seems to be a strong
aversion to "crash the kernel" features. Especially if they are meant
to run in a non-testing context.
diff mbox series

Patch

diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index e2062748791a..665ece0d55d4 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -102,6 +102,8 @@  static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
 	if (!pfn_valid(pfn))
 		return;
 
+	BUG_ON(is_zero_pfn(pfn) && rw);
+
 	page = pfn_to_page(pfn);
 	page_ext = lookup_page_ext(page);
 	anon = PageAnon(page);