mm: cachestat: fix two shmem bugs

Message ID 20240315095556.GC581298@cmpxchg.org (mailing list archive)
State New

Commit Message

Johannes Weiner March 15, 2024, 9:55 a.m. UTC
When cachestat on shmem races with swapping and invalidation, there
are two possible bugs:

1) A swapin error can leave a poisoned swap entry in the shmem
   inode's xarray. Calling get_shadow_from_swap_cache() on it
   will result in an out-of-bounds access to swapper_spaces[].

   Validate the entry with non_swap_entry() before going further.

2) When we find a valid swap entry in the shmem inode, the shadow
   entry in the swapcache might not exist yet (swap IO is still in
   progress and we're before __remove_mapping), or it might no longer
   exist (swapin, invalidation, or swapoff removed the shadow from
   the swapcache after we saw the shmem swap entry).

   This will send a NULL to workingset_test_recent(). The latter
   purely operates on pointer bits, so it won't crash - node 0, memcg
   ID 0, eviction timestamp 0, etc. are all valid inputs - but it's a
   bogus test. In theory that could result in a false "recently
   evicted" count.

   Such a false positive wouldn't be the end of the world. But for
   code clarity and (future) robustness, be explicit about this case.

   Bail on get_shadow_from_swap_cache() returning NULL.

Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
Cc: stable@vger.kernel.org				[v6.5+]
Reported-by: Chengming Zhou <chengming.zhou@linux.dev>	[Bug #1]
Reported-by: Jann Horn <jannh@google.com>		[Bug #2]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/filemap.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
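
The claim in 2) that a NULL shadow "won't crash" follows from how a
shadow entry is decoded: by shifting and masking the pointer value,
never by dereferencing it. The following is a minimal userspace sketch
of that idea, not the kernel's code. The field widths and every
model_* name are assumptions made up for illustration; the kernel
derives its real widths from the build configuration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field widths, chosen only for this sketch. */
#define MODEL_WORKINGSET_BITS 1
#define MODEL_NODE_BITS       6
#define MODEL_MEMCG_BITS      16

struct model_shadow {
	bool workingset;
	unsigned int node;
	unsigned int memcg_id;
	uintptr_t eviction;
};

/* Pack the fields into a tagged pointer-sized value (low bit set). */
static void *model_pack_shadow(uintptr_t eviction, unsigned int memcg_id,
			       unsigned int node, bool workingset)
{
	uintptr_t entry = eviction;

	entry = (entry << MODEL_MEMCG_BITS) | memcg_id;
	entry = (entry << MODEL_NODE_BITS) | node;
	entry = (entry << MODEL_WORKINGSET_BITS) | workingset;
	return (void *)((entry << 1) | 1);
}

/* Decode by shifting and masking only; the pointer is never followed. */
static struct model_shadow model_unpack_shadow(const void *shadow)
{
	uintptr_t entry = (uintptr_t)shadow >> 1;	/* drop the tag bit */
	struct model_shadow s;

	s.workingset = entry & (((uintptr_t)1 << MODEL_WORKINGSET_BITS) - 1);
	entry >>= MODEL_WORKINGSET_BITS;
	s.node = entry & (((uintptr_t)1 << MODEL_NODE_BITS) - 1);
	entry >>= MODEL_NODE_BITS;
	s.memcg_id = entry & (((uintptr_t)1 << MODEL_MEMCG_BITS) - 1);
	entry >>= MODEL_MEMCG_BITS;
	s.eviction = entry;
	return s;
}
```

In this model, model_unpack_shadow(NULL) yields node 0, memcg ID 0 and
eviction 0, which are indistinguishable from a genuine entry carrying
those values. That is why the result is a bogus test rather than a
fault, and why an explicit NULL check is the clearer choice.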

Comments

Chengming Zhou March 15, 2024, 10:43 a.m. UTC | #1
On 2024/3/15 17:55, Johannes Weiner wrote:
> When cachestat on shmem races with swapping and invalidation, there
> are two possible bugs:
> 
> 1) A swapin error can have resulted in a poisoned swap entry in the
>    shmem inode's xarray. Calling get_shadow_from_swap_cache() on it
>    will result in an out-of-bounds access to swapper_spaces[].
> 
>    Validate the entry with non_swap_entry() before going further.
> 
> 2) When we find a valid swap entry in the shmem's inode, the shadow
>    entry in the swapcache might not exist yet: swap IO is still in
>    progress and we're before __remove_mapping; swapin, invalidation,
>    or swapoff have removed the shadow from swapcache after we saw the
>    shmem swap entry.
> 
>    This will send a NULL to workingset_test_recent(). The latter
>    purely operates on pointer bits, so it won't crash - node 0, memcg
>    ID 0, eviction timestamp 0, etc. are all valid inputs - but it's a
>    bogus test. In theory that could result in a false "recently
>    evicted" count.
> 
>    Such a false positive wouldn't be the end of the world. But for
>    code clarity and (future) robustness, be explicit about this case.
> 
>    Bail on get_shadow_from_swap_cache() returning NULL.
> 
> Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
> Cc: stable@vger.kernel.org				[v6.5+]
> Reported-by: Chengming Zhou <chengming.zhou@linux.dev>	[Bug #1]
> Reported-by: Jann Horn <jannh@google.com>		[Bug #2]
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Looks good to me.

Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>

Thanks.

> ---
>  mm/filemap.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 222adac7c9c5..0aa91bf6c1f7 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -4198,7 +4198,23 @@ static void filemap_cachestat(struct address_space *mapping,
>  				/* shmem file - in swap cache */
>  				swp_entry_t swp = radix_to_swp_entry(folio);
>  
> +				/* swapin error results in poisoned entry */
> +				if (non_swap_entry(swp))
> +					goto resched;
> +
> +				/*
> +				 * Getting a swap entry from the shmem
> +				 * inode means we beat
> +				 * shmem_unuse(). rcu_read_lock()
> +				 * ensures swapoff waits for us before
> +				 * freeing the swapper space. However,
> +				 * we can race with swapping and
> +				 * invalidation, so there might not be
> +				 * a shadow in the swapcache (yet).
> +				 */
>  				shadow = get_shadow_from_swap_cache(swp);
> +				if (!shadow)
> +					goto resched;
>  			}
>  #endif
>  			if (workingset_test_recent(shadow, true, &workingset))

Nhat Pham March 16, 2024, 2:41 a.m. UTC | #2
On Fri, Mar 15, 2024 at 4:55 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> When cachestat on shmem races with swapping and invalidation, there
> are two possible bugs:
>
> 1) A swapin error can have resulted in a poisoned swap entry in the
>    shmem inode's xarray. Calling get_shadow_from_swap_cache() on it
>    will result in an out-of-bounds access to swapper_spaces[].
>
>    Validate the entry with non_swap_entry() before going further.
>
> 2) When we find a valid swap entry in the shmem's inode, the shadow
>    entry in the swapcache might not exist yet: swap IO is still in
>    progress and we're before __remove_mapping; swapin, invalidation,
>    or swapoff have removed the shadow from swapcache after we saw the
>    shmem swap entry.
>
>    This will send a NULL to workingset_test_recent(). The latter
>    purely operates on pointer bits, so it won't crash - node 0, memcg
>    ID 0, eviction timestamp 0, etc. are all valid inputs - but it's a
>    bogus test. In theory that could result in a false "recently
>    evicted" count.
>
>    Such a false positive wouldn't be the end of the world. But for
>    code clarity and (future) robustness, be explicit about this case.
>
>    Bail on get_shadow_from_swap_cache() returning NULL.
>
> Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
> Cc: stable@vger.kernel.org                              [v6.5+]
> Reported-by: Chengming Zhou <chengming.zhou@linux.dev>  [Bug #1]
> Reported-by: Jann Horn <jannh@google.com>               [Bug #2]
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Nice catch! Thanks for the report, Chengming and Jann, and thanks for
the fix, Johannes!
Reviewed-by: Nhat Pham <nphamcs@gmail.com>

> ---
>  mm/filemap.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 222adac7c9c5..0aa91bf6c1f7 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -4198,7 +4198,23 @@ static void filemap_cachestat(struct address_space *mapping,
>                                 /* shmem file - in swap cache */
>                                 swp_entry_t swp = radix_to_swp_entry(folio);
>
> +                               /* swapin error results in poisoned entry */
> +                               if (non_swap_entry(swp))
> +                                       goto resched;
> +
> +                               /*
> +                                * Getting a swap entry from the shmem
> +                                * inode means we beat
> +                                * shmem_unuse(). rcu_read_lock()
> +                                * ensures swapoff waits for us before
> +                                * freeing the swapper space. However,
> +                                * we can race with swapping and
> +                                * invalidation, so there might not be
> +                                * a shadow in the swapcache (yet).
> +                                */
>                                 shadow = get_shadow_from_swap_cache(swp);
> +                               if (!shadow)
> +                                       goto resched;
>                         }
>  #endif
>                         if (workingset_test_recent(shadow, true, &workingset))
> --
> 2.44.0
>
Nhat Pham March 16, 2024, 4:30 a.m. UTC | #3


Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index 222adac7c9c5..0aa91bf6c1f7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4198,7 +4198,23 @@  static void filemap_cachestat(struct address_space *mapping,
 				/* shmem file - in swap cache */
 				swp_entry_t swp = radix_to_swp_entry(folio);
 
+				/* swapin error results in poisoned entry */
+				if (non_swap_entry(swp))
+					goto resched;
+
+				/*
+				 * Getting a swap entry from the shmem
+				 * inode means we beat
+				 * shmem_unuse(). rcu_read_lock()
+				 * ensures swapoff waits for us before
+				 * freeing the swapper space. However,
+				 * we can race with swapping and
+				 * invalidation, so there might not be
+				 * a shadow in the swapcache (yet).
+				 */
 				shadow = get_shadow_from_swap_cache(swp);
+				if (!shadow)
+					goto resched;
 			}
 #endif
 			if (workingset_test_recent(shadow, true, &workingset))
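
The order of the two checks in the hunk above can be modelled in
userspace as follows. This is a sketch under stated assumptions, not
kernel code: the entry encoding, the table sizes and all model_* names
are hypothetical stand-ins for swp_entry_t, swapper_spaces[] and
get_shadow_from_swap_cache(). It only illustrates why the type must be
validated before it is used as an array index, and why a missing
shadow is treated as "skip" rather than passed on.

```c
#include <stddef.h>

#define MODEL_MAX_SWAPFILES 4	/* hypothetical number of real swap types */
#define MODEL_TYPE_SHIFT    24	/* hypothetical type/offset split */
#define MODEL_SLOTS         8	/* hypothetical slots per swap type */

typedef struct { unsigned long val; } model_swp_entry_t;

/* One shadow table per swap type, standing in for swapper_spaces[]. */
static void *model_swapper_spaces[MODEL_MAX_SWAPFILES][MODEL_SLOTS];

static model_swp_entry_t model_make_entry(unsigned long type,
					  unsigned long offset)
{
	model_swp_entry_t e = { (type << MODEL_TYPE_SHIFT) | offset };
	return e;
}

static unsigned long model_swp_type(model_swp_entry_t e)
{
	return e.val >> MODEL_TYPE_SHIFT;
}

static unsigned long model_swp_offset(model_swp_entry_t e)
{
	return e.val & ((1UL << MODEL_TYPE_SHIFT) - 1);
}

/* Entries whose type is past the real swap types are not swap entries. */
static int model_non_swap_entry(model_swp_entry_t e)
{
	return model_swp_type(e) >= MODEL_MAX_SWAPFILES;
}

enum model_result {
	MODEL_SKIP_POISONED,	/* check 1: type would index out of bounds */
	MODEL_SKIP_NO_SHADOW,	/* check 2: no shadow (yet, or any more) */
	MODEL_TEST_SHADOW,	/* safe to run the recency test */
};

static enum model_result model_classify(model_swp_entry_t e, void **shadowp)
{
	*shadowp = NULL;

	/* Validate before the type is ever used as an array index. */
	if (model_non_swap_entry(e))
		return MODEL_SKIP_POISONED;

	/* The modulo only keeps this toy table in bounds. */
	*shadowp = model_swapper_spaces[model_swp_type(e)]
				       [model_swp_offset(e) % MODEL_SLOTS];
	if (!*shadowp)
		return MODEL_SKIP_NO_SHADOW;

	return MODEL_TEST_SHADOW;
}
```

Both skip results correspond to the "goto resched" paths in the patch:
the entry is simply not counted, and the walk moves on.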