diff mbox series

[v2,2/2] mm: wrap __find_buddy_pfn() with a necessary buddy page validation.

Message ID 20220401181109.1477354-2-zi.yan@sent.com (mailing list archive)
State New
Headers show
Series [v2,1/2] mm: page_alloc: simplify pageblock migratetype check in __free_one_page(). | expand

Commit Message

Zi Yan April 1, 2022, 6:11 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

Whenever the buddy of a page is found from __find_buddy_pfn(),
page_is_buddy() should be used to check its validity. Add a helper
function find_buddy_page_pfn() to find the buddy page and do the check
together.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/linux-mm/CAHk-=wji_AmYygZMTsPMdJ7XksMt7kOur8oDfDdniBRMjm4VkQ@mail.gmail.com/
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/internal.h       | 23 +----------------
 mm/page_alloc.c     | 63 ++++++++++++++++++++++++++++++++++++++-------
 mm/page_isolation.c |  8 ++----
 3 files changed, 56 insertions(+), 38 deletions(-)

Comments

Linus Torvalds April 1, 2022, 6:29 p.m. UTC | #1
On Fri, Apr 1, 2022 at 11:11 AM Zi Yan <zi.yan@sent.com> wrote:
>
> Whenever the buddy of a page is found from __find_buddy_pfn(),
> page_is_buddy() should be used to check its validity. Add a helper
> function find_buddy_page_pfn() to find the buddy page and do the check
> together.

Well, this certainly looks nicer, except now:

> +extern struct page *find_buddy_page_pfn(struct page *page, unsigned int order);

that 'pfn' no longer makes sense in the name, and

> @@ -1075,11 +1118,11 @@ static inline void __free_one_page(struct page *page,
>                                                                 migratetype);
>                         return;
>                 }
> -               buddy_pfn = __find_buddy_pfn(pfn, order);
> -               buddy = page + (buddy_pfn - pfn);
>
> -               if (!page_is_buddy(page, buddy, order))
> +               buddy = find_buddy_page_pfn(page, order);
> +               if (!buddy)
>                         goto done_merging;
> +               buddy_pfn = page_to_pfn(buddy);

This case now does two "page_to_pfn()" calls (one inside
find_buddy_page_pfn(), and one explicitly on the buddy).

And those page_to_pfn() things can actually be fairly expensive. It
*looks* like just a subtraction, but it's a pointer subtraction that
can end up generating a divide by a non-power-of-two size.

NORMALLY we try very hard to make 'sizeof struct page' be exactly 8
words, but I do not believe this is actually guaranteed.

And yeah, the divide-by-a-constant can be turned into a multiply, but
even that is not necessarily always cheap.

Now, two out of three use-cases didn't actually want the buddy_pfn(),
but this one use-case does look like it might be performance-critical
and a problem.

                 Linus
Zi Yan April 1, 2022, 6:56 p.m. UTC | #2
On 1 Apr 2022, at 14:29, Linus Torvalds wrote:

> On Fri, Apr 1, 2022 at 11:11 AM Zi Yan <zi.yan@sent.com> wrote:
>>
>> Whenever the buddy of a page is found from __find_buddy_pfn(),
>> page_is_buddy() should be used to check its validity. Add a helper
>> function find_buddy_page_pfn() to find the buddy page and do the check
>> together.
>
> Well, this certainly looks nicer, except now:
>
>> +extern struct page *find_buddy_page_pfn(struct page *page, unsigned int order);
>
> that 'pfn' no longer makes sense in the name, and
>
>> @@ -1075,11 +1118,11 @@ static inline void __free_one_page(struct page *page,
>>                                                                 migratetype);
>>                         return;
>>                 }
>> -               buddy_pfn = __find_buddy_pfn(pfn, order);
>> -               buddy = page + (buddy_pfn - pfn);
>>
>> -               if (!page_is_buddy(page, buddy, order))
>> +               buddy = find_buddy_page_pfn(page, order);
>> +               if (!buddy)
>>                         goto done_merging;
>> +               buddy_pfn = page_to_pfn(buddy);
>
> This case now does two "page_to_pfn()" calls (one inside
> find_buddy_page_pfn(), and one explicitly on the buddy).
>
> And those page_to_pfn() things can actually be fairly expensive. It
> *looks* like just a subtraction, but it's a pointer subtraction that
> can end up generating a divide by a non-power-of-two size.
>
> NORMALLY we try very hard to make 'sizeof struct page' be exactly 8
> words, but I do not believe this is actually guaranteed.
>
> And yeah, the divide-by-a-constant can be turned into a multiply, but
> even that is not necessarily always cheap.
>
> Now, two out of three use-cases didn't actually want the buddy_pfn(),
> but this one use-case does look like it might be performance-critical
> and a problem.

Yeah, I should have listened to your initial suggestion. Thanks for the
explanation.

How about the patch below? If it looks good, I will send v3.

diff --git a/mm/internal.h b/mm/internal.h
index 876e66237c89..754a666e5785 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -211,28 +211,8 @@ struct alloc_context {
 	bool spread_dirty_pages;
 };

-/*
- * Locate the struct page for both the matching buddy in our
- * pair (buddy1) and the combined O(n+1) page they form (page).
- *
- * 1) Any buddy B1 will have an order O twin B2 which satisfies
- * the following equation:
- *     B2 = B1 ^ (1 << O)
- * For example, if the starting buddy (buddy2) is #8 its order
- * 1 buddy is #10:
- *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
- *
- * 2) Any buddy B will have an order O+1 parent P which
- * satisfies the following equation:
- *     P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
- */
-static inline unsigned long
-__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
-{
-	return page_pfn ^ (1 << order);
-}
+extern struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
+				unsigned int order, unsigned long *buddy_pfn);

 extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 				unsigned long end_pfn, struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ea106146686..8195b4f64ec5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -998,6 +998,57 @@ static inline void del_page_from_free_list(struct page *page, struct zone *zone,
 	zone->free_area[order].nr_free--;
 }

+/*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ *     B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ *     P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline unsigned long
+__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
+{
+	return page_pfn ^ (1 << order);
+}
+
+/*
+ * Find the buddy of @page and validate it.
+ * @page: The input page
+ * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
+ *       function is used in the performance-critical __free_one_page().
+ * @order: The order of the page
+ * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
+ *             page_to_pfn().
+ *
+ * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
+ * not the same as @page. The validation is necessary before use it.
+ *
+ * Return: the found buddy page or NULL if not found.
+ */
+struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
+			unsigned int order, unsigned long *buddy_pfn)
+{
+	struct page *buddy;
+
+	*buddy_pfn = __find_buddy_pfn(pfn, order);
+	buddy = page + (*buddy_pfn - pfn);
+
+	if (page_is_buddy(page, buddy, order))
+		return buddy;
+	return NULL;
+}
+
 /*
  * If this is not the largest possible page, check if the buddy
  * of the next-highest order is free. If it is, it's possible
@@ -1010,18 +1061,17 @@ static inline bool
 buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
 		   struct page *page, unsigned int order)
 {
-	struct page *higher_page, *higher_buddy;
-	unsigned long combined_pfn;
+	struct page *higher_page;
+	unsigned long higher_page_pfn;

 	if (order >= MAX_ORDER - 2)
 		return false;

-	combined_pfn = buddy_pfn & pfn;
-	higher_page = page + (combined_pfn - pfn);
-	buddy_pfn = __find_buddy_pfn(combined_pfn, order + 1);
-	higher_buddy = higher_page + (buddy_pfn - combined_pfn);
+	higher_page_pfn = buddy_pfn & pfn;
+	higher_page = page + (higher_page_pfn - pfn);

-	return page_is_buddy(higher_page, higher_buddy, order + 1);
+	return find_buddy_page_pfn(higher_page, higher_page_pfn, order + 1,
+			&buddy_pfn) != NULL;
 }

 /*
@@ -1075,10 +1125,9 @@ static inline void __free_one_page(struct page *page,
 								migratetype);
 			return;
 		}
-		buddy_pfn = __find_buddy_pfn(pfn, order);
-		buddy = page + (buddy_pfn - pfn);

-		if (!page_is_buddy(page, buddy, order))
+		buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
+		if (!buddy)
 			goto done_merging;

 		if (unlikely(order >= pageblock_order)) {
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index f67c4c70f17f..4fbd27ba91c6 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -70,7 +70,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	unsigned long flags, nr_pages;
 	bool isolated_page = false;
 	unsigned int order;
-	unsigned long pfn, buddy_pfn;
+	unsigned long buddy_pfn;
 	struct page *buddy;

 	zone = page_zone(page);
@@ -89,11 +89,9 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	if (PageBuddy(page)) {
 		order = buddy_order(page);
 		if (order >= pageblock_order && order < MAX_ORDER - 1) {
-			pfn = page_to_pfn(page);
-			buddy_pfn = __find_buddy_pfn(pfn, order);
-			buddy = page + (buddy_pfn - pfn);
-
-			if (!is_migrate_isolate_page(buddy)) {
+			buddy = find_buddy_page_pfn(page, page_to_pfn(page),
+						    order, &buddy_pfn);
+			if (buddy && !is_migrate_isolate_page(buddy)) {
 				isolated_page = !!__isolate_free_page(page, order);
 				/*
 				 * Isolating a free page in an isolated pageblock

--
Best Regards,
Yan, Zi
Linus Torvalds April 1, 2022, 7:01 p.m. UTC | #3
On Fri, Apr 1, 2022 at 11:56 AM Zi Yan <ziy@nvidia.com> wrote:
>
> How about the patch below? If it looks good, I will send v3.

I can't see anything worrisome, but by now I've looked at several
versions and who knows what I'm missing.

Making it inline and allowing a NULL 'buddy_pfn' pointer for the cases
that don't care might be an option, but I don't think it matters
hugely.

                   Linus
Zi Yan April 1, 2022, 7:25 p.m. UTC | #4
On 1 Apr 2022, at 15:01, Linus Torvalds wrote:

> On Fri, Apr 1, 2022 at 11:56 AM Zi Yan <ziy@nvidia.com> wrote:
>>
>> How about the patch below? If it looks good, I will send v3.
>
> I can't see anything worrisome, but by now I've looked at several
> versions and who knows what I'm missing.
>
> Making it inline and allowing a NULL 'buddy_pfn' pointer for the cases
> that don't care might be an option, but I don't think it matters
> hugely.

What do you mean by inlining it? Moving the function and __find_buddy_pfn()
to mm/internal.h and mark both inline?

Something like below to allow a NULL 'buddy_pfn'? buddy_pfn is needed
to store the result of __find_buddy_pfn(). The code does not look
as nice as before.

struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
                        unsigned int order, unsigned long *buddy_pfn)
{
        struct page *buddy;

        if (buddy_pfn) {
                *buddy_pfn = __find_buddy_pfn(pfn, order);
                buddy = page + (*buddy_pfn - pfn);
        } else
                buddy = page + (__find_buddy_pfn(pfn, order) - pfn);

        if (page_is_buddy(page, buddy, order))
                return buddy;
        return NULL;
}

or

struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
                        unsigned int order, unsigned long *buddy_pfn)
{
        struct page *buddy;
        unsigned long local_buddy_pfn = __find_buddy_pfn(pfn, order);

        buddy = page + (local_buddy_pfn - pfn);
        if (buddy_pfn)
                *buddy_pfn = local_buddy_pfn;

        if (page_is_buddy(page, buddy, order))
                return buddy;
        return NULL;
}

--
Best Regards,
Yan, Zi
Vlastimil Babka April 1, 2022, 10:33 p.m. UTC | #5
On 4/1/22 21:25, Zi Yan wrote:
> On 1 Apr 2022, at 15:01, Linus Torvalds wrote:
> 
>> On Fri, Apr 1, 2022 at 11:56 AM Zi Yan <ziy@nvidia.com> wrote:
>>>
>>> How about the patch below? If it looks good, I will send v3.
>>
>> I can't see anything worrisome, but by now I've looked at several
>> versions and who knows what I'm missing.
>>
>> Making it inline and allowing a NULL 'buddy_pfn' pointer for the cases
>> that don't care might be an option, but I don't think it matters
>> hugely.
> 
> What do you mean by inlining it? Moving the function and __find_buddy_pfn()
> to mm/internal.h and mark both inline?

I would prefer that for the sake of __free_one_page().

> Something like below to allow a NULL 'buddy_pfn'? buddy_pfn is needed
> to store the result of __find_buddy_pfn(). The code does not look
> as nice as before.
> 
> struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
>                         unsigned int order, unsigned long *buddy_pfn)
> {
>         struct page *buddy;
> 
>         if (buddy_pfn) {
>                 *buddy_pfn = __find_buddy_pfn(pfn, order);
>                 buddy = page + (*buddy_pfn - pfn);
>         } else
>                 buddy = page + (__find_buddy_pfn(pfn, order) - pfn);
> 
>         if (page_is_buddy(page, buddy, order))
>                 return buddy;
>         return NULL;
> }
> 
> or
> 
> struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
>                         unsigned int order, unsigned long *buddy_pfn)
> {
>         struct page *buddy;
>         unsigned long local_buddy_pfn = __find_buddy_pfn(pfn, order);
> 
>         buddy = page + (local_buddy_pfn - pfn);
>         if (buddy_pfn)
>                 *buddy_pfn = local_buddy_pfn;
> 
>         if (page_is_buddy(page, buddy, order))
>                 return buddy;
>         return NULL;
> }

The latter looks better. I would also use a name e.g. '__buddy_pfn'
instead of 'local_'.

Thanks.

> 
> --
> Best Regards,
> Yan, Zi
diff mbox series

Patch

diff --git a/mm/internal.h b/mm/internal.h
index 876e66237c89..f43565ed5ff6 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -211,28 +211,7 @@  struct alloc_context {
 	bool spread_dirty_pages;
 };
 
-/*
- * Locate the struct page for both the matching buddy in our
- * pair (buddy1) and the combined O(n+1) page they form (page).
- *
- * 1) Any buddy B1 will have an order O twin B2 which satisfies
- * the following equation:
- *     B2 = B1 ^ (1 << O)
- * For example, if the starting buddy (buddy2) is #8 its order
- * 1 buddy is #10:
- *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
- *
- * 2) Any buddy B will have an order O+1 parent P which
- * satisfies the following equation:
- *     P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
- */
-static inline unsigned long
-__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
-{
-	return page_pfn ^ (1 << order);
-}
+extern struct page *find_buddy_page_pfn(struct page *page, unsigned int order);
 
 extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 				unsigned long end_pfn, struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ea106146686..8aedc6fbfdd0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -998,6 +998,51 @@  static inline void del_page_from_free_list(struct page *page, struct zone *zone,
 	zone->free_area[order].nr_free--;
 }
 
+/*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ *     B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ *     P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline unsigned long
+__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
+{
+	return page_pfn ^ (1 << order);
+}
+
+
+/*
+ * Find the buddy of @page and validate it.
+ * @page: The input page
+ * @order: The order of the input page
+ *
+ * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
+ * not the same as @page. The validation is necessary before use it.
+ *
+ * Return: the found buddy page or NULL if not found.
+ */
+struct page *find_buddy_page_pfn(struct page *page, unsigned int order)
+{
+	unsigned long pfn = page_to_pfn(page);
+	unsigned long buddy_pfn = __find_buddy_pfn(pfn, order);
+	struct page *buddy = page + (buddy_pfn - pfn);
+
+	if (page_is_buddy(page, buddy, order))
+		return buddy;
+	return NULL;
+}
+
 /*
  * If this is not the largest possible page, check if the buddy
  * of the next-highest order is free. If it is, it's possible
@@ -1010,18 +1055,16 @@  static inline bool
 buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
 		   struct page *page, unsigned int order)
 {
-	struct page *higher_page, *higher_buddy;
-	unsigned long combined_pfn;
+	struct page *higher_page;
+	unsigned long higher_page_pfn;
 
 	if (order >= MAX_ORDER - 2)
 		return false;
 
-	combined_pfn = buddy_pfn & pfn;
-	higher_page = page + (combined_pfn - pfn);
-	buddy_pfn = __find_buddy_pfn(combined_pfn, order + 1);
-	higher_buddy = higher_page + (buddy_pfn - combined_pfn);
+	higher_page_pfn = buddy_pfn & pfn;
+	higher_page = page + (higher_page_pfn - pfn);
 
-	return page_is_buddy(higher_page, higher_buddy, order + 1);
+	return find_buddy_page_pfn(higher_page, order + 1) != NULL;
 }
 
 /*
@@ -1075,11 +1118,11 @@  static inline void __free_one_page(struct page *page,
 								migratetype);
 			return;
 		}
-		buddy_pfn = __find_buddy_pfn(pfn, order);
-		buddy = page + (buddy_pfn - pfn);
 
-		if (!page_is_buddy(page, buddy, order))
+		buddy = find_buddy_page_pfn(page, order);
+		if (!buddy)
 			goto done_merging;
+		buddy_pfn = page_to_pfn(buddy);
 
 		if (unlikely(order >= pageblock_order)) {
 			/*
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index f67c4c70f17f..7326c9f57d55 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -70,7 +70,6 @@  static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	unsigned long flags, nr_pages;
 	bool isolated_page = false;
 	unsigned int order;
-	unsigned long pfn, buddy_pfn;
 	struct page *buddy;
 
 	zone = page_zone(page);
@@ -89,11 +88,8 @@  static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	if (PageBuddy(page)) {
 		order = buddy_order(page);
 		if (order >= pageblock_order && order < MAX_ORDER - 1) {
-			pfn = page_to_pfn(page);
-			buddy_pfn = __find_buddy_pfn(pfn, order);
-			buddy = page + (buddy_pfn - pfn);
-
-			if (!is_migrate_isolate_page(buddy)) {
+			buddy = find_buddy_page_pfn(page, order);
+			if (buddy && !is_migrate_isolate_page(buddy)) {
 				isolated_page = !!__isolate_free_page(page, order);
 				/*
 				 * Isolating a free page in an isolated pageblock