Message ID | 20180622162841.25114-1-mhocko@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 06/22/2018 06:28 PM, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > There is no real reason to blow up just because the caller doesn't know > that __get_free_pages cannot return highmem pages. Simply fix that up > silently. Even if we have some confused users such a fixup will not be > harmful. > > Signed-off-by: Michal Hocko <mhocko@suse.com> > --- > Hi Andrew, > previously posted [1] but it fell through cracks. Can we merge it now? > > [1] http://lkml.kernel.org/r/20171129160446.jluzpv3n6mjc3fwv@dhcp22.suse.cz > > mm/page_alloc.c | 10 +++------- > 1 file changed, 3 insertions(+), 7 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 1521100f1e63..5f56f662a52d 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -4402,18 +4402,14 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid, > EXPORT_SYMBOL(__alloc_pages_nodemask); > > /* > - * Common helper functions. > + * Common helper functions. Never use with __GFP_HIGHMEM because the returned > + * address cannot represent highmem pages. Use alloc_pages and then kmap if > + * you need to access high mem. > */ > unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order) > { > struct page *page; > > - /* > - * __get_free_pages() returns a virtual address, which cannot represent > - * a highmem page > - */ > - VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0); > - > page = alloc_pages(gfp_mask, order); The previous version had also replaced the line above with: + page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order); This one doesn't, yet you say "fix that up silently". Bug? > if (!page) > return 0; >
On Tue 26-06-18 15:57:39, Vlastimil Babka wrote: > On 06/22/2018 06:28 PM, Michal Hocko wrote: > > From: Michal Hocko <mhocko@suse.com> > > > > There is no real reason to blow up just because the caller doesn't know > > that __get_free_pages cannot return highmem pages. Simply fix that up > > silently. Even if we have some confused users such a fixup will not be > > harmful. > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > --- > > Hi Andrew, > > previously posted [1] but it fell through cracks. Can we merge it now? > > > > [1] http://lkml.kernel.org/r/20171129160446.jluzpv3n6mjc3fwv@dhcp22.suse.cz > > > > mm/page_alloc.c | 10 +++------- > > 1 file changed, 3 insertions(+), 7 deletions(-) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > index 1521100f1e63..5f56f662a52d 100644 > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -4402,18 +4402,14 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid, > > EXPORT_SYMBOL(__alloc_pages_nodemask); > > > > /* > > - * Common helper functions. > > + * Common helper functions. Never use with __GFP_HIGHMEM because the returned > > + * address cannot represent highmem pages. Use alloc_pages and then kmap if > > + * you need to access high mem. > > */ > > unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order) > > { > > struct page *page; > > > > - /* > > - * __get_free_pages() returns a virtual address, which cannot represent > > - * a highmem page > > - */ > > - VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0); > > - > > page = alloc_pages(gfp_mask, order); > > The previous version had also replaced the line above with: > > + page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order); > > This one doesn't, yet you say "fix that up silently". Bug? got lost somewhere on the way during the discussion. Thanks for spotting that. Andrew, could you add gfp_mask & ~__GFP_HIGHMEM please? Or should I resubmit?
On Tue, 26 Jun 2018 15:57:39 +0200 Vlastimil Babka <vbabka@suse.cz> wrote: > On 06/22/2018 06:28 PM, Michal Hocko wrote: > > From: Michal Hocko <mhocko@suse.com> > > > > There is no real reason to blow up just because the caller doesn't know > > that __get_free_pages cannot return highmem pages. Simply fix that up > > silently. Even if we have some confused users such a fixup will not be > > harmful. > > > > ... > > > /* > > - * Common helper functions. > > + * Common helper functions. Never use with __GFP_HIGHMEM because the returned > > + * address cannot represent highmem pages. Use alloc_pages and then kmap if > > + * you need to access high mem. > > */ > > unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order) > > { > > struct page *page; > > > > - /* > > - * __get_free_pages() returns a virtual address, which cannot represent > > - * a highmem page > > - */ > > - VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0); > > - > > page = alloc_pages(gfp_mask, order); > > The previous version had also replaced the line above with: > > + page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order); > > This one doesn't, yet you say "fix that up silently". Bug? > This reminds me what is irritating about the patch. We're adding additional code to a somewhat fast path to handle something which we know never happens, thanks to the now-removed check. This newly-added code might become functional in the future, if people add incorrect callers. Callers whose incorrectness would have been revealed by the now-removed check! So.. argh. Really, the changelog isn't right. There *is* a real reason to blow up. Effectively the caller is attempting to obtain the virtual address of a highmem page without having kmapped it first. That's an outright bug. An alternative might be to just accept the bogus __GFP_HIGHMEM, let page_to_virt() return a crap address and wait for the user bug reports to come in when someone tries to run the offending code on a highmem machine. That shouldn't take too long - the page allocator will prefer to return a highmem page in this case. And adding a rule to the various static checkers should catch most offenders. Or just leave the ode as it is now.
On Tue 26-06-18 10:04:16, Andrew Morton wrote: > On Tue, 26 Jun 2018 15:57:39 +0200 Vlastimil Babka <vbabka@suse.cz> wrote: > > > On 06/22/2018 06:28 PM, Michal Hocko wrote: > > > From: Michal Hocko <mhocko@suse.com> > > > > > > There is no real reason to blow up just because the caller doesn't know > > > that __get_free_pages cannot return highmem pages. Simply fix that up > > > silently. Even if we have some confused users such a fixup will not be > > > harmful. > > > > > > > ... > > > > > /* > > > - * Common helper functions. > > > + * Common helper functions. Never use with __GFP_HIGHMEM because the returned > > > + * address cannot represent highmem pages. Use alloc_pages and then kmap if > > > + * you need to access high mem. > > > */ > > > unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order) > > > { > > > struct page *page; > > > > > > - /* > > > - * __get_free_pages() returns a virtual address, which cannot represent > > > - * a highmem page > > > - */ > > > - VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0); > > > - > > > page = alloc_pages(gfp_mask, order); > > > > The previous version had also replaced the line above with: > > > > + page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order); > > > > This one doesn't, yet you say "fix that up silently". Bug? > > > > This reminds me what is irritating about the patch. We're adding > additional code to a somewhat fast path to handle something which we > know never happens, thanks to the now-removed check. > > This newly-added code might become functional in the future, if people > add incorrect callers. Callers whose incorrectness would have been > revealed by the now-removed check! That check depends on a debugging config option which is not enabled all the time so how does this help in most production systems? More over it is "blow up on incorrect use" kind of check. I am pretty sure Linus would have some word about such error handling... > So.. argh. > > Really, the changelog isn't right. There *is* a real reason to blow > up. Effectively the caller is attempting to obtain the virtual address > of a highmem page without having kmapped it first. That's an outright > bug. And as I've argued before the code would be wrong regardless. We would leak the memory or worse touch somebody's else kmap without knowing that. So we have a choice between a mem leak, data corruption k or a silent fixup. I would prefer the last option. And blowing up on a BUG is not much better on something that is easily fixable. I am not really convinced that & ~__GFP_HIGHMEM is something to lose sleep over. > An alternative might be to just accept the bogus __GFP_HIGHMEM, let > page_to_virt() return a crap address and wait for the user bug reports > to come in when someone tries to run the offending code on a highmem > machine. That shouldn't take too long - the page allocator will prefer > to return a highmem page in this case. Well, I would be concerned about page_address returning an active kmap much more. > And adding a rule to the various static checkers should catch most > offenders. This all sounds like a heavy lifting for something that can be easily fixed up. If we ever see the additional and with a mask as a problem we can think about a more optimal solution.
On Wed 27-06-18 09:34:20, Michal Hocko wrote: > On Tue 26-06-18 10:04:16, Andrew Morton wrote: [...] > > Really, the changelog isn't right. There *is* a real reason to blow > > up. Effectively the caller is attempting to obtain the virtual address > > of a highmem page without having kmapped it first. That's an outright > > bug. > > And as I've argued before the code would be wrong regardless. We would > leak the memory or worse touch somebody's else kmap without knowing > that. So we have a choice between a mem leak, data corruption k or a > silent fixup. I would prefer the last option. And blowing up on a BUG > is not much better on something that is easily fixable. I am not really > convinced that & ~__GFP_HIGHMEM is something to lose sleep over. It's been some time since I've checked that changelog and you are right it should contain all the above so the changelog should be: " There is no real reason to blow up just because the caller doesn't know that __get_free_pages cannot return highmem pages. Simply fix that up silently. Even if we have some confused users such a fixup will not be harmful. On the other hand an incorrect usage can lead to either a memory leak or worse a memory corruption when the allocated page hashes to an already kmaped page. Most workloads run with CONFIG_DEBUG_VM disabled so the assert wouldn't help. "
On 06/27/2018 09:34 AM, Michal Hocko wrote: > On Tue 26-06-18 10:04:16, Andrew Morton wrote: > > And as I've argued before the code would be wrong regardless. We would > leak the memory or worse touch somebody's else kmap without knowing > that. So we have a choice between a mem leak, data corruption k or a > silent fixup. I would prefer the last option. And blowing up on a BUG > is not much better on something that is easily fixable. I am not really > convinced that & ~__GFP_HIGHMEM is something to lose sleep over. Maybe put the fixup into a "#ifdef CONFIG_HIGHMEM" block and then modern systems won't care? In that case it could even be if (WARN_ON_ONCE(...)) so future cases with wrong expectations would become known. Vlastimil
On Wed 27-06-18 09:50:01, Vlastimil Babka wrote: > On 06/27/2018 09:34 AM, Michal Hocko wrote: > > On Tue 26-06-18 10:04:16, Andrew Morton wrote: > > > > And as I've argued before the code would be wrong regardless. We would > > leak the memory or worse touch somebody's else kmap without knowing > > that. So we have a choice between a mem leak, data corruption k or a > > silent fixup. I would prefer the last option. And blowing up on a BUG > > is not much better on something that is easily fixable. I am not really > > convinced that & ~__GFP_HIGHMEM is something to lose sleep over. > > Maybe put the fixup into a "#ifdef CONFIG_HIGHMEM" block and then modern > systems won't care? In that case it could even be if (WARN_ON_ONCE(...)) > so future cases with wrong expectations would become known. Yes that could be done as well. Or maybe we can make __GFP_HIGHMEM 0 for !HIGHMEM systems. Does something really rely on it being non-zero?
On 06/27/2018 09:54 AM, Michal Hocko wrote: > On Wed 27-06-18 09:50:01, Vlastimil Babka wrote: >> On 06/27/2018 09:34 AM, Michal Hocko wrote: >>> On Tue 26-06-18 10:04:16, Andrew Morton wrote: >>> >>> And as I've argued before the code would be wrong regardless. We would >>> leak the memory or worse touch somebody's else kmap without knowing >>> that. So we have a choice between a mem leak, data corruption k or a >>> silent fixup. I would prefer the last option. And blowing up on a BUG >>> is not much better on something that is easily fixable. I am not really >>> convinced that & ~__GFP_HIGHMEM is something to lose sleep over. >> >> Maybe put the fixup into a "#ifdef CONFIG_HIGHMEM" block and then modern >> systems won't care? In that case it could even be if (WARN_ON_ONCE(...)) >> so future cases with wrong expectations would become known. > > Yes that could be done as well. Or maybe we can make __GFP_HIGHMEM 0 for > !HIGHMEM systems. Does something really rely on it being non-zero? I guess gfp_zone() would have to be checked, dunno about the rewrite of GFP_ZONE_TABLE (CCing people). In general checks like "if (flags & __GFP_HIGHMEM)" would become false, which probably should not be a problem, unless something expect the flag to be there and errors out if it isn't.
On Wed 27-06-18 12:47:39, Vlastimil Babka wrote: > On 06/27/2018 09:54 AM, Michal Hocko wrote: > > On Wed 27-06-18 09:50:01, Vlastimil Babka wrote: > >> On 06/27/2018 09:34 AM, Michal Hocko wrote: > >>> On Tue 26-06-18 10:04:16, Andrew Morton wrote: > >>> > >>> And as I've argued before the code would be wrong regardless. We would > >>> leak the memory or worse touch somebody's else kmap without knowing > >>> that. So we have a choice between a mem leak, data corruption k or a > >>> silent fixup. I would prefer the last option. And blowing up on a BUG > >>> is not much better on something that is easily fixable. I am not really > >>> convinced that & ~__GFP_HIGHMEM is something to lose sleep over. > >> > >> Maybe put the fixup into a "#ifdef CONFIG_HIGHMEM" block and then modern > >> systems won't care? In that case it could even be if (WARN_ON_ONCE(...)) > >> so future cases with wrong expectations would become known. > > > > Yes that could be done as well. Or maybe we can make __GFP_HIGHMEM 0 for > > !HIGHMEM systems. Does something really rely on it being non-zero? > > I guess gfp_zone() would have to be checked, dunno about the rewrite of > GFP_ZONE_TABLE (CCing people). > In general checks like "if (flags & __GFP_HIGHMEM)" would become false, > which probably should not be a problem, unless something expect the flag > to be there and errors out if it isn't. Well, __GFP_HIGHMEM should be basically GFP_KERNEL for !highmem systems. But most checks I have seen try to mask it off. Having it 0 would help to reduce at least some code.
On Wed, 27 Jun 2018 09:50:01 +0200 Vlastimil Babka <vbabka@suse.cz> wrote: > On 06/27/2018 09:34 AM, Michal Hocko wrote: > > On Tue 26-06-18 10:04:16, Andrew Morton wrote: > > > > And as I've argued before the code would be wrong regardless. We would > > leak the memory or worse touch somebody's else kmap without knowing > > that. So we have a choice between a mem leak, data corruption k or a > > silent fixup. I would prefer the last option. And blowing up on a BUG > > is not much better on something that is easily fixable. I am not really > > convinced that & ~__GFP_HIGHMEM is something to lose sleep over. > > Maybe put the fixup into a "#ifdef CONFIG_HIGHMEM" block and then modern > systems won't care? In that case it could even be if (WARN_ON_ONCE(...)) > so future cases with wrong expectations would become known. > The more I think about it, the more I like the VM_BUG_ON. Look, if I was reviewing code which did page = alloc_page(__GFP_HIGHMEM); addr = page_to_virt(page); I would say "that's a bug, you forgot to kmap the page". And any code which does __get_free_pages(__GFP_HIGHMEM) is just as buggy: it's requesting the virtual address of a high page without having kmapped it. Core MM shouldn't be silently kludging around the bug by restricting the caller to using lowmem pages. Maybe the caller really does want to use highmem, in which case the caller should be using alloc_page(__GFP_HIGHMEM) and kmap(). Because core MM detects and reports this bug, the developer will fix it.
On Wed 27-06-18 14:14:12, Andrew Morton wrote: > On Wed, 27 Jun 2018 09:50:01 +0200 Vlastimil Babka <vbabka@suse.cz> wrote: > > > On 06/27/2018 09:34 AM, Michal Hocko wrote: > > > On Tue 26-06-18 10:04:16, Andrew Morton wrote: > > > > > > And as I've argued before the code would be wrong regardless. We would > > > leak the memory or worse touch somebody's else kmap without knowing > > > that. So we have a choice between a mem leak, data corruption k or a > > > silent fixup. I would prefer the last option. And blowing up on a BUG > > > is not much better on something that is easily fixable. I am not really > > > convinced that & ~__GFP_HIGHMEM is something to lose sleep over. > > > > Maybe put the fixup into a "#ifdef CONFIG_HIGHMEM" block and then modern > > systems won't care? In that case it could even be if (WARN_ON_ONCE(...)) > > so future cases with wrong expectations would become known. > > > > The more I think about it, the more I like the VM_BUG_ON. > > Look, if I was reviewing code which did > > page = alloc_page(__GFP_HIGHMEM); > addr = page_to_virt(page); > > I would say "that's a bug, you forgot to kmap the page". > > And any code which does __get_free_pages(__GFP_HIGHMEM) is just as > buggy: it's requesting the virtual address of a high page without > having kmapped it. Core MM shouldn't be silently kludging around the > bug by restricting the caller to using lowmem pages. I would argue that internal kernel APIs should trust their callers. Panicing with an unknown context is about the worst way to teach developers how to use the API properly. Because it will be end users seeing an outage. So I would simply not care beyond documenting the expectation. If we want to be more careful then fix it up. If you disagree then just drop the patch. I do not insist so much to spend much more time on something I thought was quite obvious. BUG_ON for an inpropoer API usage is considered harmful for quite a long time by now. I do not see why this would be any different.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 1521100f1e63..5f56f662a52d 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4402,18 +4402,14 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid, EXPORT_SYMBOL(__alloc_pages_nodemask); /* - * Common helper functions. + * Common helper functions. Never use with __GFP_HIGHMEM because the returned + * address cannot represent highmem pages. Use alloc_pages and then kmap if + * you need to access high mem. */ unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order) { struct page *page; - /* - * __get_free_pages() returns a virtual address, which cannot represent - * a highmem page - */ - VM_BUG_ON((gfp_mask & __GFP_HIGHMEM) != 0); - page = alloc_pages(gfp_mask, order); if (!page) return 0;