Message ID: 20250402201841.3245371-3-dwmw2@infradead.org (mailing list archive)
State: New
Series: [RFC,1/3] mm: Introduce for_each_valid_pfn() and use it from reserve_bootmem_region()
On Wed, Apr 02, 2025 at 09:18:41PM +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Introduce a pfn_first_valid() helper which takes a pointer to the PFN and
> updates it to point to the first valid PFN starting from that point, and
> returns true if a valid PFN was found.
>
> This largely mirrors pfn_valid(), calling into a pfn_section_first_valid()
> helper which is trivial for the !CONFIG_SPARSEMEM_VMEMMAP case, and in
> the VMEMMAP case will skip to the next subsection as needed.
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

with a small nit below:

> +static inline bool first_valid_pfn(unsigned long *p_pfn)
> +{
> +	unsigned long pfn = *p_pfn;
> +	unsigned long nr = pfn_to_section_nr(pfn);
> +	struct mem_section *ms;
> +	bool ret = false;
> +
> +	ms = __pfn_to_section(pfn);
> +
> +	rcu_read_lock_sched();
> +
> +	while (!ret && nr <= __highest_present_section_nr) {

This could be just for (;;); we break anyway when ret becomes true or we
get past the last present section.

> +		if (valid_section(ms) &&
> +		    (early_section(ms) || pfn_section_first_valid(ms, &pfn))) {
> +			ret = true;
> +			break;
> +		}
> +
> +		nr++;
> +		if (nr > __highest_present_section_nr)
> +			break;
> +
> +		pfn = section_nr_to_pfn(nr);
> +		ms = __pfn_to_section(pfn);
> +	}
> +
> +	rcu_read_unlock_sched();
> +
> +	*p_pfn = pfn;
> +
> +	return ret;
> +}
> +
> +#define for_each_valid_pfn(_pfn, _start_pfn, _end_pfn)		\
> +	for ((_pfn) = (_start_pfn);				\
> +	     first_valid_pfn(&(_pfn)) && (_pfn) < (_end_pfn);	\
> +	     (_pfn)++)
> +
>  #endif
>
>  static inline int pfn_in_present_section(unsigned long pfn)
> --
> 2.49.0
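For context, the series title says the new iterator is first put to use in
reserve_bootmem_region(). The sketch below (not the actual patch) shows the
shape of such a conversion in a hypothetical caller; do_something() is an
invented stand-in for the per-page work:

	unsigned long pfn;

	/* Before: every candidate PFN is tested individually, even
	 * across large holes in the physical memory map. */
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (!pfn_valid(pfn))
			continue;
		do_something(pfn_to_page(pfn));
	}

	/* After: the iterator jumps over missing sections (and, with
	 * CONFIG_SPARSEMEM_VMEMMAP, missing subsections) in one step. */
	for_each_valid_pfn(pfn, start_pfn, end_pfn)
		do_something(pfn_to_page(pfn));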
On Thu, 2025-04-03 at 09:24 +0300, Mike Rapoport wrote:
> On Wed, Apr 02, 2025 at 09:18:41PM +0100, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> >
> > Introduce a pfn_first_valid() helper which takes a pointer to the PFN
> > and updates it to point to the first valid PFN starting from that
> > point, and returns true if a valid PFN was found.
> >
> > This largely mirrors pfn_valid(), calling into a
> > pfn_section_first_valid() helper which is trivial for the
> > !CONFIG_SPARSEMEM_VMEMMAP case, and in the VMEMMAP case will skip to
> > the next subsection as needed.
> >
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Thanks.

> with a small nit below:
>
> > +static inline bool first_valid_pfn(unsigned long *p_pfn)
> > +{
> > +	unsigned long pfn = *p_pfn;
> > +	unsigned long nr = pfn_to_section_nr(pfn);
> > +	struct mem_section *ms;
> > +	bool ret = false;
> > +
> > +	ms = __pfn_to_section(pfn);
> > +
> > +	rcu_read_lock_sched();
> > +
> > +	while (!ret && nr <= __highest_present_section_nr) {
>
> This could be just for (;;); we break anyway when ret becomes true or
> we get past the last present section.

True for the 'ret' part, but not *nicely* for the last present section.
If the original pfn is higher than the last present section, it could
trigger that check before entering the loop.

Yes, in that case 'ms' will be NULL, valid_section(NULL) is false, and
you're right that it'll make it through to the check in the loop without
crashing. So it would currently be harmless, but I didn't like it. It's
relying on the loop not to do the wrong thing with an input which is
arguably invalid.

I'll see if I can make it neater. I may drop the 'ret' variable
completely and just turn the match clause into unlock-and-return-true.
I *like* having a single unlock site, but I think I like simpler loop
code more than that.

FWIW, I think the check for (PHYS_PFN(PFN_PHYS(pfn)) != pfn) at the
start of pfn_valid() a few lines above is similarly redundant, because
if the high bits are set in the PFN then pfn_to_section_nr(pfn) is
surely going to be higher than NR_MEM_SECTIONS and it'll get thrown out
at the very next check, won't it?

I care because I didn't bother to duplicate that 'redundant' check in my
first_valid_pfn(), so if there's a reason for it that I'm missing, I
should take a closer look.

I'm also missing the reason why the FLATMEM code in memory_model.h does
'unsigned long pfn_offset = ARCH_PFN_OFFSET' and then uses its local
pfn_offset variable, instead of just using ARCH_PFN_OFFSET directly as I
do in the FLATMEM for_each_valid_pfn() macro.

> > +		if (valid_section(ms) &&
> > +		    (early_section(ms) || pfn_section_first_valid(ms, &pfn))) {
> > +			ret = true;
> > +			break;
> > +		}
> > +
> > +		nr++;
> > +		if (nr > __highest_present_section_nr)
> > +			break;
> > +
> > +		pfn = section_nr_to_pfn(nr);
> > +		ms = __pfn_to_section(pfn);
> > +	}
> > +
> > +	rcu_read_unlock_sched();
> > +
> > +	*p_pfn = pfn;
> > +
> > +	return ret;
> > +}
> > +
> > +#define for_each_valid_pfn(_pfn, _start_pfn, _end_pfn)		\
> > +	for ((_pfn) = (_start_pfn);				\
> > +	     first_valid_pfn(&(_pfn)) && (_pfn) < (_end_pfn);	\
> > +	     (_pfn)++)
> > +
> >  #endif
> >
> >  static inline int pfn_in_present_section(unsigned long pfn)
> > --
> > 2.49.0
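David's redundancy argument about (PHYS_PFN(PFN_PHYS(pfn)) != pfn) can be
checked with plain shift arithmetic. Below is a standalone userspace
illustration, assuming PAGE_SHIFT=12, SECTION_SIZE_BITS=27,
MAX_PHYSMEM_BITS=48 and a 64-bit unsigned long (one plausible arm64
configuration); the macro definitions mirror the kernel's pfn.h/mmzone.h
equivalents:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	27
#define MAX_PHYSMEM_BITS	48
#define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
#define NR_MEM_SECTIONS		(1UL << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS))

int main(void)
{
	/* A PFN with a bit at or above (64 - PAGE_SHIFT): the left shift
	 * in PFN_PHYS() truncates it, so the round-trip check fails... */
	unsigned long pfn = 1UL << (64 - PAGE_SHIFT);
	uint64_t phys = (uint64_t)pfn << PAGE_SHIFT;		/* PFN_PHYS() */
	unsigned long round_trip = (unsigned long)(phys >> PAGE_SHIFT);	/* PHYS_PFN() */

	printf("PHYS_PFN(PFN_PHYS(pfn)) == pfn?  %s\n",
	       round_trip == pfn ? "yes" : "no");

	/* ...but such a PFN is already rejected by the very next check,
	 * because its section number vastly exceeds NR_MEM_SECTIONS. */
	printf("pfn_to_section_nr(pfn) %#lx >= NR_MEM_SECTIONS %#lx?  %s\n",
	       pfn >> PFN_SECTION_SHIFT, NR_MEM_SECTIONS,
	       (pfn >> PFN_SECTION_SHIFT) >= NR_MEM_SECTIONS ? "yes" : "no");
	return 0;
}

With these assumed constants the round trip fails ("no") while the section
bound check rejects the PFN ("yes"), which is consistent with both David's
reasoning here and Mike's reply below.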
On Thu, 2025-04-03 at 08:07 +0100, David Woodhouse wrote:
>
> I'll see if I can make it neater. I may drop the 'ret' variable
> completely and just turn the match clause into unlock-and-return-true.
> I *like* having a single unlock site. But I think I like simpler loop
> code more than that.

That's better (IMO).

And I note that pfn_valid() already doesn't follow the modern fetish
for having only one unlock site even when it makes the surrounding code
more complex to do so.

static inline bool first_valid_pfn(unsigned long *p_pfn)
{
	unsigned long pfn = *p_pfn;
	unsigned long nr = pfn_to_section_nr(pfn);
	struct mem_section *ms;

	rcu_read_lock_sched();

	while (nr <= __highest_present_section_nr) {
		ms = __pfn_to_section(pfn);

		if (valid_section(ms) &&
		    (early_section(ms) || pfn_section_first_valid(ms, &pfn))) {
			*p_pfn = pfn;
			rcu_read_unlock_sched();
			return true;
		}

		/* Nothing left in this section? Skip to next section */
		nr++;
		pfn = section_nr_to_pfn(nr);
	}

	rcu_read_unlock_sched();

	return false;
}
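To see how the for_each_valid_pfn() control flow plays out, here is a toy
userspace model with an entirely hypothetical set of valid PFNs and a
linear-scan stand-in for the helper. It shows the macro's three-part
contract: first_valid_pfn() advances the cursor to the next valid PFN, the
bound check stops at _end_pfn, and the increment steps past each visited PFN:

#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_PFN	16UL

/* Stand-in for pfn_valid(): pretend only PFNs 3..5 and 9 are valid. */
static bool model_pfn_valid(unsigned long pfn)
{
	return (pfn >= 3 && pfn <= 5) || pfn == 9;
}

/* Same contract as the kernel helper: advance *p_pfn to the first
 * valid PFN at or after its current value; false if none remain. */
static bool first_valid_pfn(unsigned long *p_pfn)
{
	for (unsigned long pfn = *p_pfn; pfn < MODEL_MAX_PFN; pfn++) {
		if (model_pfn_valid(pfn)) {
			*p_pfn = pfn;
			return true;
		}
	}
	return false;
}

#define for_each_valid_pfn(_pfn, _start_pfn, _end_pfn)		\
	for ((_pfn) = (_start_pfn);				\
	     first_valid_pfn(&(_pfn)) && (_pfn) < (_end_pfn);	\
	     (_pfn)++)

int main(void)
{
	unsigned long pfn;

	for_each_valid_pfn(pfn, 0, 8)	/* prints 3 4 5; 9 is past the end */
		printf("%lu\n", pfn);
	return 0;
}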
On Thu, Apr 03, 2025 at 08:07:22AM +0100, David Woodhouse wrote:
> On Thu, 2025-04-03 at 09:24 +0300, Mike Rapoport wrote:
> > with a small nit below:
> >
> > > +static inline bool first_valid_pfn(unsigned long *p_pfn)
> > > +{
> > > +	unsigned long pfn = *p_pfn;
> > > +	unsigned long nr = pfn_to_section_nr(pfn);
> > > +	struct mem_section *ms;
> > > +	bool ret = false;
> > > +
> > > +	ms = __pfn_to_section(pfn);
> > > +
> > > +	rcu_read_lock_sched();
> > > +
> > > +	while (!ret && nr <= __highest_present_section_nr) {
> >
> > This could be just for (;;); we break anyway when ret becomes true or
> > we get past the last present section.
>
> True for the 'ret' part, but not *nicely* for the last present section.
> If the original pfn is higher than the last present section, it could
> trigger that check before entering the loop.
>
> Yes, in that case 'ms' will be NULL, valid_section(NULL) is false, and
> you're right that it'll make it through to the check in the loop without
> crashing. So it would currently be harmless, but I didn't like it. It's
> relying on the loop not to do the wrong thing with an input which is
> arguably invalid.
>
> I'll see if I can make it neater. I may drop the 'ret' variable
> completely and just turn the match clause into unlock-and-return-true.
> I *like* having a single unlock site, but I think I like simpler loop
> code more than that.
>
> FWIW, I think the check for (PHYS_PFN(PFN_PHYS(pfn)) != pfn) at the
> start of pfn_valid() a few lines above is similarly redundant, because
> if the high bits are set in the PFN then pfn_to_section_nr(pfn) is
> surely going to be higher than NR_MEM_SECTIONS and it'll get thrown out
> at the very next check, won't it?

I believe the check for (PHYS_PFN(PFN_PHYS(pfn)) != pfn) got into the
generic version from arm64::pfn_valid(), which historically supported
both FLATMEM and SPARSEMEM. I can't think of a configuration in which
(PHYS_PFN(PFN_PHYS(pfn)) != pfn) and pfn_to_section_nr(pfn) won't be
higher than NR_MEM_SECTIONS, but with all the variants arm64 has for
PAGE_SHIFT and ARM64_PA_BITS I could be missing something.

> I care because I didn't bother to duplicate that 'redundant' check in my
> first_valid_pfn(), so if there's a reason for it that I'm missing, I
> should take a closer look.
>
> I'm also missing the reason why the FLATMEM code in memory_model.h does
> 'unsigned long pfn_offset = ARCH_PFN_OFFSET' and then uses its local
> pfn_offset variable, instead of just using ARCH_PFN_OFFSET directly as I
> do in the FLATMEM for_each_valid_pfn() macro.

Don't remember now, but I surely had some $REASON for that :)
On Thu, Apr 03, 2025 at 08:15:41AM +0100, David Woodhouse wrote:
> On Thu, 2025-04-03 at 08:07 +0100, David Woodhouse wrote:
> >
> > I'll see if I can make it neater. I may drop the 'ret' variable
> > completely and just turn the match clause into unlock-and-return-true.
> > I *like* having a single unlock site. But I think I like simpler loop
> > code more than that.
>
> That's better (IMO).
>
> And I note that pfn_valid() already doesn't follow the modern fetish
> for having only one unlock site even when it makes the surrounding code
> more complex to do so.
>
> static inline bool first_valid_pfn(unsigned long *p_pfn)
> {
> 	unsigned long pfn = *p_pfn;
> 	unsigned long nr = pfn_to_section_nr(pfn);
> 	struct mem_section *ms;
>
> 	rcu_read_lock_sched();
>
> 	while (nr <= __highest_present_section_nr) {
> 		ms = __pfn_to_section(pfn);

Maybe move the declaration here:

	struct mem_section *ms = __pfn_to_section(pfn);

>
> 		if (valid_section(ms) &&
> 		    (early_section(ms) || pfn_section_first_valid(ms, &pfn))) {
> 			*p_pfn = pfn;
> 			rcu_read_unlock_sched();
> 			return true;
> 		}
>
> 		/* Nothing left in this section? Skip to next section */
> 		nr++;
> 		pfn = section_nr_to_pfn(nr);
> 	}
>
> 	rcu_read_unlock_sched();
>
> 	return false;
> }
On Thu, 2025-04-03 at 17:13 +0300, Mike Rapoport wrote:
>
> > static inline bool first_valid_pfn(unsigned long *p_pfn)
> > {
> > 	unsigned long pfn = *p_pfn;
> > 	unsigned long nr = pfn_to_section_nr(pfn);
> > 	struct mem_section *ms;
> >
> > 	rcu_read_lock_sched();
> >
> > 	while (nr <= __highest_present_section_nr) {
> > 		ms = __pfn_to_section(pfn);
>
> Maybe move the declaration here:
>
> 	struct mem_section *ms = __pfn_to_section(pfn);

Ack.

https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/for_each_valid_pfn
On Thu, Apr 03, 2025 at 03:17:44PM +0100, David Woodhouse wrote:
> On Thu, 2025-04-03 at 17:13 +0300, Mike Rapoport wrote:
> >
> > > static inline bool first_valid_pfn(unsigned long *p_pfn)
> > > {
> > > 	unsigned long pfn = *p_pfn;
> > > 	unsigned long nr = pfn_to_section_nr(pfn);
> > > 	struct mem_section *ms;
> > >
> > > 	rcu_read_lock_sched();
> > >
> > > 	while (nr <= __highest_present_section_nr) {
> > > 		ms = __pfn_to_section(pfn);
> >
> > Maybe move the declaration here:
> >
> > 	struct mem_section *ms = __pfn_to_section(pfn);
>
> Ack.
>
> https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/for_each_valid_pfn

Fine with me, keep the RB tag :)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 32ecb5cadbaf..a389d1857b85 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2074,11 +2074,37 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 
 	return usage ? test_bit(idx, usage->subsection_map) : 0;
 }
+
+static inline bool pfn_section_first_valid(struct mem_section *ms, unsigned long *pfn)
+{
+	struct mem_section_usage *usage = READ_ONCE(ms->usage);
+	int idx = subsection_map_index(*pfn);
+	unsigned long bit;
+
+	if (!usage)
+		return false;
+
+	if (test_bit(idx, usage->subsection_map))
+		return true;
+
+	/* Find the next subsection that exists */
+	bit = find_next_bit(usage->subsection_map, SUBSECTIONS_PER_SECTION, idx);
+	if (bit == SUBSECTIONS_PER_SECTION)
+		return false;
+
+	*pfn = (*pfn & PAGE_SECTION_MASK) + (bit * PAGES_PER_SUBSECTION);
+	return true;
+}
 #else
 static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 {
 	return 1;
 }
+
+static inline bool pfn_section_first_valid(struct mem_section *ms, unsigned long *pfn)
+{
+	return true;
+}
 #endif
 
 void sparse_init_early_section(int nid, struct page *map, unsigned long pnum,
@@ -2127,6 +2153,45 @@ static inline int pfn_valid(unsigned long pfn)
 
 	return ret;
 }
+
+static inline bool first_valid_pfn(unsigned long *p_pfn)
+{
+	unsigned long pfn = *p_pfn;
+	unsigned long nr = pfn_to_section_nr(pfn);
+	struct mem_section *ms;
+	bool ret = false;
+
+	ms = __pfn_to_section(pfn);
+
+	rcu_read_lock_sched();
+
+	while (!ret && nr <= __highest_present_section_nr) {
+		if (valid_section(ms) &&
+		    (early_section(ms) || pfn_section_first_valid(ms, &pfn))) {
+			ret = true;
+			break;
+		}
+
+		nr++;
+		if (nr > __highest_present_section_nr)
+			break;
+
+		pfn = section_nr_to_pfn(nr);
+		ms = __pfn_to_section(pfn);
+	}
+
+	rcu_read_unlock_sched();
+
+	*p_pfn = pfn;
+
+	return ret;
+}
+
+#define for_each_valid_pfn(_pfn, _start_pfn, _end_pfn)		\
+	for ((_pfn) = (_start_pfn);				\
+	     first_valid_pfn(&(_pfn)) && (_pfn) < (_end_pfn);	\
+	     (_pfn)++)
+
 #endif
 
 static inline int pfn_in_present_section(unsigned long pfn)
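For readers unfamiliar with the subsection map that pfn_section_first_valid()
consults, here is a userspace model of the skip it performs, assuming the
common geometry of 128M sections, 2M subsections and 4K pages (so
SUBSECTIONS_PER_SECTION == 64 and the map fits in one word); find_next_bit()
is emulated with __builtin_ctzl():

#include <stdio.h>

#define PAGE_SHIFT		12
#define PAGES_PER_SUBSECTION	(1UL << (21 - PAGE_SHIFT))	/* 2M / 4K = 512 */
#define SUBSECTIONS_PER_SECTION	64UL
#define PAGE_SECTION_MASK	(~(SUBSECTIONS_PER_SECTION * PAGES_PER_SUBSECTION - 1))

/* Minimal stand-in for find_next_bit() on a single-word bitmap. */
static unsigned long find_next_bit1(unsigned long word, unsigned long size,
				    unsigned long offset)
{
	word &= ~0UL << offset;			/* ignore bits below 'offset' */
	return word ? (unsigned long)__builtin_ctzl(word) : size;
}

int main(void)
{
	/* Pretend only subsections 3 and 40 of this section are present. */
	unsigned long subsection_map = (1UL << 3) | (1UL << 40);
	unsigned long pfn = 5 * PAGES_PER_SUBSECTION;	/* inside subsection 5 */
	unsigned long idx = (pfn & ~PAGE_SECTION_MASK) / PAGES_PER_SUBSECTION;
	unsigned long bit = find_next_bit1(subsection_map,
					   SUBSECTIONS_PER_SECTION, idx);

	if (bit == SUBSECTIONS_PER_SECTION)
		printf("no valid subsection left in this section\n");
	else
		printf("skip from pfn %#lx to pfn %#lx (subsection %lu)\n",
		       pfn,
		       (pfn & PAGE_SECTION_MASK) + bit * PAGES_PER_SUBSECTION,
		       bit);
	return 0;
}

With the hypothetical map above, a starting PFN in subsection 5 is advanced
straight to the first PFN of subsection 40, which is exactly the skip that
lets for_each_valid_pfn() cross sparse sections without testing every PFN.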