Message ID | 20220502173558.2510641-1-minchan@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: fix is_pinnable_page against on cma page | expand |
On 02.05.22 19:35, Minchan Kim wrote: > Pages on CMA area could have MIGRATE_ISOLATE as well as MIGRATE_CMA > so current is_pinnable_page could miss CMA pages which has MIGRATE_ > ISOLATE. It ends up putting CMA pages longterm pinning possible on > pin_user_pages APIs so CMA allocation fails. > > The CMA allocation path protects the migration type change race > using zone->lock but what GUP path need to know is just whether the > page is on CMA area or not rather than exact type. Thus, we don't > need zone->lock but just checks the migratype in either of > (MIGRATE_ISOLATE and MIGRATE_CMA). > > Signed-off-by: Minchan Kim <minchan@kernel.org> > --- > include/linux/mm.h | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 6acca5cecbc5..f59bbe3296e3 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -1625,8 +1625,10 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, > #ifdef CONFIG_MIGRATION > static inline bool is_pinnable_page(struct page *page) > { > - return !(is_zone_movable_page(page) || is_migrate_cma_page(page)) || > - is_zero_pfn(page_to_pfn(page)); > + int mt = get_pageblock_migratetype(page); > + > + return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || > + mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); > } > #else > static inline bool is_pinnable_page(struct page *page) That implies that other memory ranges that are currently isolated (memory offlining, alloc_contig_range()) cannot be pinned. That might not be a bad thing, however, I think we could end up failing to pin something that's temporarily unmovable (due to temporary references). However, I assume we have the same issue right now already with ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these there are temporarily unmovable and we fail to migrate. But it would now apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm...
On Mon, May 02, 2022 at 08:02:31PM +0200, David Hildenbrand wrote: > On 02.05.22 19:35, Minchan Kim wrote: > > Pages on CMA area could have MIGRATE_ISOLATE as well as MIGRATE_CMA > > so current is_pinnable_page could miss CMA pages which has MIGRATE_ > > ISOLATE. It ends up putting CMA pages longterm pinning possible on > > pin_user_pages APIs so CMA allocation fails. > > > > The CMA allocation path protects the migration type change race > > using zone->lock but what GUP path need to know is just whether the > > page is on CMA area or not rather than exact type. Thus, we don't > > need zone->lock but just checks the migratype in either of > > (MIGRATE_ISOLATE and MIGRATE_CMA). > > > > Signed-off-by: Minchan Kim <minchan@kernel.org> > > --- > > include/linux/mm.h | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 6acca5cecbc5..f59bbe3296e3 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -1625,8 +1625,10 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, > > #ifdef CONFIG_MIGRATION > > static inline bool is_pinnable_page(struct page *page) > > { > > - return !(is_zone_movable_page(page) || is_migrate_cma_page(page)) || > > - is_zero_pfn(page_to_pfn(page)); > > + int mt = get_pageblock_migratetype(page); > > + > > + return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || > > + mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); > > } > > #else > > static inline bool is_pinnable_page(struct page *page) > > That implies that other memory ranges that are currently isolated > (memory offlining, alloc_contig_range()) cannot be pinned. That might > not be a bad thing, however, I think we could end up failing to pin > something that's temporarily unmovable (due to temporary references). Sure. > > However, I assume we have the same issue right now already with > ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these ZONE_MOVALBE is also changed dynamically? > there are temporarily unmovable and we fail to migrate. But it would now > apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... Didn't parse your last mention.
Hi Minchan, I love your patch! Yet something to improve: [auto build test ERROR on hnaz-mm/master] url: https://github.com/intel-lab-lkp/linux/commits/Minchan-Kim/mm-fix-is_pinnable_page-against-on-cma-page/20220503-013733 base: https://github.com/hnaz/linux-mm master config: nios2-defconfig (https://download.01.org/0day-ci/archive/20220503/202205030306.in794Jr3-lkp@intel.com/config) compiler: nios2-linux-gcc (GCC) 11.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/1b8636710c31d44310f1d344e337c207562b851d git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Minchan-Kim/mm-fix-is_pinnable_page-against-on-cma-page/20220503-013733 git checkout 1b8636710c31d44310f1d344e337c207562b851d # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=nios2 prepare If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from include/linux/pid_namespace.h:7, from include/linux/ptrace.h:10, from arch/nios2/kernel/asm-offsets.c:9: include/linux/mm.h: In function 'is_pinnable_page': >> include/linux/mm.h:1630:54: error: 'MIGRATE_CMA' undeclared (first use in this function); did you mean 'MIGRATE_SYNC'? 1630 | return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || | ^~~~~~~~~~~ | MIGRATE_SYNC include/linux/mm.h:1630:54: note: each undeclared identifier is reported only once for each function it appears in >> include/linux/mm.h:1631:23: error: 'MIGRATE_ISOLATE' undeclared (first use in this function); did you mean 'MIGRATE_MOVABLE'? 1631 | mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); | ^~~~~~~~~~~~~~~ | MIGRATE_MOVABLE make[2]: *** [scripts/Makefile.build:122: arch/nios2/kernel/asm-offsets.s] Error 1 make[2]: Target '__build' not remade because of errors. make[1]: *** [Makefile:1283: prepare0] Error 2 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:226: __sub-make] Error 2 make: Target 'prepare' not remade because of errors. vim +1630 include/linux/mm.h 1623 1624 /* MIGRATE_CMA and ZONE_MOVABLE do not allow pin pages */ 1625 #ifdef CONFIG_MIGRATION 1626 static inline bool is_pinnable_page(struct page *page) 1627 { 1628 int mt = get_pageblock_migratetype(page); 1629 > 1630 return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || > 1631 mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); 1632 } 1633 #else 1634 static inline bool is_pinnable_page(struct page *page) 1635 { 1636 return true; 1637 } 1638 #endif 1639
Hi Minchan, I love your patch! Yet something to improve: [auto build test ERROR on hnaz-mm/master] url: https://github.com/intel-lab-lkp/linux/commits/Minchan-Kim/mm-fix-is_pinnable_page-against-on-cma-page/20220503-013733 base: https://github.com/hnaz/linux-mm master config: x86_64-randconfig-a012-20220502 (https://download.01.org/0day-ci/archive/20220503/202205030525.VFo3Ds9w-lkp@intel.com/config) compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 09325d36061e42b495d1f4c7e933e260eac260ed) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/1b8636710c31d44310f1d344e337c207562b851d git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Minchan-Kim/mm-fix-is_pinnable_page-against-on-cma-page/20220503-013733 git checkout 1b8636710c31d44310f1d344e337c207562b851d # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 prepare If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from arch/x86/kernel/asm-offsets.c:13: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:20: >> include/linux/mm.h:1630:47: error: use of undeclared identifier 'MIGRATE_CMA'; did you mean 'MIGRATE_SYNC'? return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || ^~~~~~~~~~~ MIGRATE_SYNC include/linux/migrate_mode.h:18:2: note: 'MIGRATE_SYNC' declared here MIGRATE_SYNC, ^ In file included from arch/x86/kernel/asm-offsets.c:13: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:20: >> include/linux/mm.h:1631:9: error: use of undeclared identifier 'MIGRATE_ISOLATE'; did you mean 'MIGRATE_MOVABLE'? mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); ^~~~~~~~~~~~~~~ MIGRATE_MOVABLE include/linux/mmzone.h:44:2: note: 'MIGRATE_MOVABLE' declared here MIGRATE_MOVABLE, ^ 2 errors generated. make[2]: *** [scripts/Makefile.build:122: arch/x86/kernel/asm-offsets.s] Error 1 make[2]: Target '__build' not remade because of errors. make[1]: *** [Makefile:1283: prepare0] Error 2 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:226: __sub-make] Error 2 make: Target 'prepare' not remade because of errors. vim +1630 include/linux/mm.h 1623 1624 /* MIGRATE_CMA and ZONE_MOVABLE do not allow pin pages */ 1625 #ifdef CONFIG_MIGRATION 1626 static inline bool is_pinnable_page(struct page *page) 1627 { 1628 int mt = get_pageblock_migratetype(page); 1629 > 1630 return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || > 1631 mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); 1632 } 1633 #else 1634 static inline bool is_pinnable_page(struct page *page) 1635 { 1636 return true; 1637 } 1638 #endif 1639
>>>> However, I assume we have the same issue right now already with >> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these > > ZONE_MOVALBE is also changed dynamically? > Sorry, with "same issue" I meant failing to pin if having to migrate and the page is temporarily unmovable. >> there are temporarily unmovable and we fail to migrate. But it would now >> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... > > Didn't parse your last mention. On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have to migrate now when pinning.
Hi Minchan,
I love your patch! Perhaps something to improve:
[auto build test WARNING on hnaz-mm/master]
url: https://github.com/intel-lab-lkp/linux/commits/Minchan-Kim/mm-fix-is_pinnable_page-against-on-cma-page/20220503-013733
base: https://github.com/hnaz/linux-mm master
config: hexagon-randconfig-r034-20220501 (https://download.01.org/0day-ci/archive/20220503/202205031644.EtLKbqIX-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 363b3a645a1e30011cc8da624f13dac5fd915628)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/1b8636710c31d44310f1d344e337c207562b851d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Minchan-Kim/mm-fix-is_pinnable_page-against-on-cma-page/20220503-013733
git checkout 1b8636710c31d44310f1d344e337c207562b851d
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
In file included from fs/statfs.c:2:
In file included from include/linux/syscalls.h:88:
In file included from include/trace/syscall.h:7:
In file included from include/linux/trace_events.h:6:
In file included from include/linux/ring_buffer.h:5:
include/linux/mm.h:1630:47: error: use of undeclared identifier 'MIGRATE_CMA'; did you mean 'MIGRATE_SYNC'?
return !(is_zone_movable_page(page) || mt == MIGRATE_CMA ||
^~~~~~~~~~~
MIGRATE_SYNC
include/linux/migrate_mode.h:18:2: note: 'MIGRATE_SYNC' declared here
MIGRATE_SYNC,
^
In file included from fs/statfs.c:2:
In file included from include/linux/syscalls.h:88:
In file included from include/trace/syscall.h:7:
In file included from include/linux/trace_events.h:6:
In file included from include/linux/ring_buffer.h:5:
include/linux/mm.h:1631:9: error: use of undeclared identifier 'MIGRATE_ISOLATE'; did you mean 'MIGRATE_MOVABLE'?
mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page)));
^~~~~~~~~~~~~~~
MIGRATE_MOVABLE
include/linux/mmzone.h:44:2: note: 'MIGRATE_MOVABLE' declared here
MIGRATE_MOVABLE,
^
>> fs/statfs.c:131:3: warning: 'memcpy' will always overflow; destination buffer has size 64, but size argument is 88 [-Wfortify-source]
memcpy(&buf, st, sizeof(*st));
^
1 warning and 2 errors generated.
vim +/memcpy +131 fs/statfs.c
c8b91accfa1059 Al Viro 2011-03-12 125
c8b91accfa1059 Al Viro 2011-03-12 126 static int do_statfs_native(struct kstatfs *st, struct statfs __user *p)
c8b91accfa1059 Al Viro 2011-03-12 127 {
c8b91accfa1059 Al Viro 2011-03-12 128 struct statfs buf;
7ed1ee6118ae77 Al Viro 2010-03-23 129
c8b91accfa1059 Al Viro 2011-03-12 130 if (sizeof(buf) == sizeof(*st))
c8b91accfa1059 Al Viro 2011-03-12 @131 memcpy(&buf, st, sizeof(*st));
7ed1ee6118ae77 Al Viro 2010-03-23 132 else {
c8b91accfa1059 Al Viro 2011-03-12 133 if (sizeof buf.f_blocks == 4) {
c8b91accfa1059 Al Viro 2011-03-12 134 if ((st->f_blocks | st->f_bfree | st->f_bavail |
c8b91accfa1059 Al Viro 2011-03-12 135 st->f_bsize | st->f_frsize) &
7ed1ee6118ae77 Al Viro 2010-03-23 136 0xffffffff00000000ULL)
7ed1ee6118ae77 Al Viro 2010-03-23 137 return -EOVERFLOW;
7ed1ee6118ae77 Al Viro 2010-03-23 138 /*
7ed1ee6118ae77 Al Viro 2010-03-23 139 * f_files and f_ffree may be -1; it's okay to stuff
7ed1ee6118ae77 Al Viro 2010-03-23 140 * that into 32 bits
7ed1ee6118ae77 Al Viro 2010-03-23 141 */
c8b91accfa1059 Al Viro 2011-03-12 142 if (st->f_files != -1 &&
c8b91accfa1059 Al Viro 2011-03-12 143 (st->f_files & 0xffffffff00000000ULL))
7ed1ee6118ae77 Al Viro 2010-03-23 144 return -EOVERFLOW;
c8b91accfa1059 Al Viro 2011-03-12 145 if (st->f_ffree != -1 &&
c8b91accfa1059 Al Viro 2011-03-12 146 (st->f_ffree & 0xffffffff00000000ULL))
7ed1ee6118ae77 Al Viro 2010-03-23 147 return -EOVERFLOW;
7ed1ee6118ae77 Al Viro 2010-03-23 148 }
7ed1ee6118ae77 Al Viro 2010-03-23 149
c8b91accfa1059 Al Viro 2011-03-12 150 buf.f_type = st->f_type;
c8b91accfa1059 Al Viro 2011-03-12 151 buf.f_bsize = st->f_bsize;
c8b91accfa1059 Al Viro 2011-03-12 152 buf.f_blocks = st->f_blocks;
c8b91accfa1059 Al Viro 2011-03-12 153 buf.f_bfree = st->f_bfree;
c8b91accfa1059 Al Viro 2011-03-12 154 buf.f_bavail = st->f_bavail;
c8b91accfa1059 Al Viro 2011-03-12 155 buf.f_files = st->f_files;
c8b91accfa1059 Al Viro 2011-03-12 156 buf.f_ffree = st->f_ffree;
c8b91accfa1059 Al Viro 2011-03-12 157 buf.f_fsid = st->f_fsid;
c8b91accfa1059 Al Viro 2011-03-12 158 buf.f_namelen = st->f_namelen;
c8b91accfa1059 Al Viro 2011-03-12 159 buf.f_frsize = st->f_frsize;
c8b91accfa1059 Al Viro 2011-03-12 160 buf.f_flags = st->f_flags;
c8b91accfa1059 Al Viro 2011-03-12 161 memset(buf.f_spare, 0, sizeof(buf.f_spare));
c8b91accfa1059 Al Viro 2011-03-12 162 }
c8b91accfa1059 Al Viro 2011-03-12 163 if (copy_to_user(p, &buf, sizeof(buf)))
c8b91accfa1059 Al Viro 2011-03-12 164 return -EFAULT;
7ed1ee6118ae77 Al Viro 2010-03-23 165 return 0;
7ed1ee6118ae77 Al Viro 2010-03-23 166 }
7ed1ee6118ae77 Al Viro 2010-03-23 167
On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: > > >>>> However, I assume we have the same issue right now already with > >> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these > > > > ZONE_MOVALBE is also changed dynamically? > > > > Sorry, with "same issue" I meant failing to pin if having to migrate and > the page is temporarily unmovable. > > >> there are temporarily unmovable and we fail to migrate. But it would now > >> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... > > > > Didn't parse your last mention. > > On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have > to migrate now when pinning. I don't understand your point. My problem is pin_user_pages with FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area without migrating page out of movable zone or CMA area. That's why try_grab_folio checks whether target page stays in those movable areas. However, to check CMA area, is_migrate_cma_page is racy so the FOLL_LONGTERM flag semantic is broken right now. Do you see any problem of the fix? A thing to get some attention is whether we need READ_ONCE or not for the local variable mt.
On 03.05.22 17:26, Minchan Kim wrote: > On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: >> >>>>>> However, I assume we have the same issue right now already with >>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these >>> >>> ZONE_MOVALBE is also changed dynamically? >>> >> >> Sorry, with "same issue" I meant failing to pin if having to migrate and >> the page is temporarily unmovable. >> >>>> there are temporarily unmovable and we fail to migrate. But it would now >>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... >>> >>> Didn't parse your last mention. >> >> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have >> to migrate now when pinning. > > I don't understand your point. My problem is pin_user_pages with > FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area > without migrating page out of movable zone or CMA area. > That's why try_grab_folio checks whether target page stays in those > movable areas. However, to check CMA area, is_migrate_cma_page is > racy so the FOLL_LONGTERM flag semantic is broken right now. > > Do you see any problem of the fix? My point is that you might decide to migrate a page because you stumble over MIGRATE_ISOLATE, although there is no need to reject long-term pinning and to trigger page migration. Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume someone reserves gigantic pages (alloc_contig_range()) and you have concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to be migrated, which can fail if the page is temporarily unmovable. See my point? We will try migrating in cases where we don't have to migrate. I think what we would want to do is always reject pinning a CMA page, independent of the isolation status. but we don't have that information available. I raised in the past that we should look into preserving the migration type and turning MIGRATE_ISOLATE essentially into an additional flag. So I guess this patch is the right thing to do for now, but I wanted to spell out the implications. > > A thing to get some attention is whether we need READ_ONCE or not > for the local variable mt. > Hmm good point. Staring at __get_pfnblock_flags_mask(), I don't think there is anything stopping the compiler from re-reading the value. But we don't care if we're reading MIGRATE_CMA or MIGRATE_ISOLATE, not something in between.
On Tue, May 03, 2022 at 06:02:33PM +0200, David Hildenbrand wrote: > On 03.05.22 17:26, Minchan Kim wrote: > > On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: > >> > >>>>>> However, I assume we have the same issue right now already with > >>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these > >>> > >>> ZONE_MOVALBE is also changed dynamically? > >>> > >> > >> Sorry, with "same issue" I meant failing to pin if having to migrate and > >> the page is temporarily unmovable. > >> > >>>> there are temporarily unmovable and we fail to migrate. But it would now > >>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... > >>> > >>> Didn't parse your last mention. > >> > >> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have > >> to migrate now when pinning. > > > > I don't understand your point. My problem is pin_user_pages with > > FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area > > without migrating page out of movable zone or CMA area. > > That's why try_grab_folio checks whether target page stays in those > > movable areas. However, to check CMA area, is_migrate_cma_page is > > racy so the FOLL_LONGTERM flag semantic is broken right now. > > > > Do you see any problem of the fix? > > My point is that you might decide to migrate a page because you stumble > over MIGRATE_ISOLATE, although there is no need to reject long-term > pinning and to trigger page migration. > > Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume > someone reserves gigantic pages (alloc_contig_range()) and you have > concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. > > GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to > be migrated, which can fail if the page is temporarily unmovable. Why is the page temporarily unmovable? The GUP didn't increase the refcount in the case. If it's not migrabtable, that's not a fault from the GUP but someone is already holding the temporal refcount. It's not the scope this patchset would try to solve it. > > See my point? We will try migrating in cases where we don't have to Still not clear for me what you are concerning. > migrate. I think what we would want to do is always reject pinning a CMA > page, independent of the isolation status. but we don't have that Always reject pinning a CMA page if it is *FOLL_LONGTERM* > information available. page && (MIGRATE_CMA | MIGRATE_ISOLATE) && gup_flags is not enough for it? > > I raised in the past that we should look into preserving the migration > type and turning MIGRATE_ISOLATE essentially into an additional flag. > > > So I guess this patch is the right thing to do for now, but I wanted to > spell out the implications. I want but still don't understand what you want to write further about the implication parts. If you make more clear, I am happy to include it. > > > > > A thing to get some attention is whether we need READ_ONCE or not > > for the local variable mt. > > > > Hmm good point. Staring at __get_pfnblock_flags_mask(), I don't think > there is anything stopping the compiler from re-reading the value. But > we don't care if we're reading MIGRATE_CMA or MIGRATE_ISOLATE, not > something in between. How about this? CPU A CPU B is_pinnable_page .. .. set_pageblock_migratetype(MIGRATE_ISOLATE) mt == MIGRATE_CMA get_pageblock_miratetype(page) returns MIGRATE_ISOLATE mt == MIGRATE_ISOLATE set_pageblock_migratetype(MIGRATE_CMA) get_pageblock_miratetype(page) returns MIGRATE_CMA So both conditions fails to detect it.
>> GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to >> be migrated, which can fail if the page is temporarily unmovable. > > Why is the page temporarily unmovable? The GUP didn't increase the > refcount in the case. If it's not migrabtable, that's not a fault > from the GUP but someone is already holding the temporal refcount. > It's not the scope this patchset would try to solve it. You can have other references on the page that turn it temporarily unmovable, for example, via FOLL_GET, short-term FOLL_PIN. > >> >> See my point? We will try migrating in cases where we don't have to > > Still not clear for me what you are concerning. > >> migrate. I think what we would want to do is always reject pinning a CMA >> page, independent of the isolation status. but we don't have that > > Always reject pinning a CMA page if it is *FOLL_LONGTERM* Yes. > >> information available. > > page && (MIGRATE_CMA | MIGRATE_ISOLATE) && gup_flags is not enough > for it? > >> >> I raised in the past that we should look into preserving the migration >> type and turning MIGRATE_ISOLATE essentially into an additional flag. >> >> >> So I guess this patch is the right thing to do for now, but I wanted to >> spell out the implications. > > I want but still don't understand what you want to write further > about the implication parts. If you make more clear, I am happy to > include it. What I am essentially saying is that when rejecting to long-term FOLL_PIN something that is MIGRATE_ISOLATE now, we might now end up having to migrate pages that are actually fine to get pinned, because they are not actual CMA pages. And any such migration might fail when pages are temporarily unmovable. > >> >>> >>> A thing to get some attention is whether we need READ_ONCE or not >>> for the local variable mt. >>> >> >> Hmm good point. Staring at __get_pfnblock_flags_mask(), I don't think >> there is anything stopping the compiler from re-reading the value. But >> we don't care if we're reading MIGRATE_CMA or MIGRATE_ISOLATE, not >> something in between. > > How about this? > > CPU A CPU B > > is_pinnable_page > .. > .. set_pageblock_migratetype(MIGRATE_ISOLATE) > mt == MIGRATE_CMA > get_pageblock_miratetype(page) > returns MIGRATE_ISOLATE > mt == MIGRATE_ISOLATE set_pageblock_migratetype(MIGRATE_CMA) > get_pageblock_miratetype(page) > returns MIGRATE_CMA > > So both conditions fails to detect it. I think you're right. That's nasty.
On Tue, May 03, 2022 at 07:27:06PM +0200, David Hildenbrand wrote: > >> GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to > >> be migrated, which can fail if the page is temporarily unmovable. > > > > Why is the page temporarily unmovable? The GUP didn't increase the > > refcount in the case. If it's not migrabtable, that's not a fault > > from the GUP but someone is already holding the temporal refcount. > > It's not the scope this patchset would try to solve it. > > You can have other references on the page that turn it temporarily > unmovable, for example, via FOLL_GET, short-term FOLL_PIN. Sure. However, user didn't passed the FOLL_LONGTERM. In that case, the temporal page migration failure was expected. What we want to guarantee for successful page migration is only FOLL_LONGTERM. If you are talking about the general problem(any GUP API without FOLL_LONGTERM flag which is supposed to be short-term could turn into long-term pinning by several reasons - I had struggled with those issues - FOLL_LONGTERM is misnormer to me), yeah, I agree we need to fix it but it's orthgonal issue. > > > > >> > >> See my point? We will try migrating in cases where we don't have to > > > > Still not clear for me what you are concerning. > > > >> migrate. I think what we would want to do is always reject pinning a CMA > >> page, independent of the isolation status. but we don't have that > > > > Always reject pinning a CMA page if it is *FOLL_LONGTERM* > > Yes. > > > > >> information available. > > > > page && (MIGRATE_CMA | MIGRATE_ISOLATE) && gup_flags is not enough > > for it? > > > >> > >> I raised in the past that we should look into preserving the migration > >> type and turning MIGRATE_ISOLATE essentially into an additional flag. > >> > >> > >> So I guess this patch is the right thing to do for now, but I wanted to > >> spell out the implications. > > > > I want but still don't understand what you want to write further > > about the implication parts. If you make more clear, I am happy to > > include it. > > What I am essentially saying is that when rejecting to long-term > FOLL_PIN something that is MIGRATE_ISOLATE now, we might now end up > having to migrate pages that are actually fine to get pinned, because > they are not actual CMA pages. And any such migration might fail when > pages are temporarily unmovable. Now I understand concern. Then how about introducing cma areas list and use it instead of migrate type in is_pinnable_page struct cma { .. .. list_head list }; bool is_cma_page(unsigned long pfn) { for cma in cma_list if (pfn >= cma->base_pfn && pfn < cma->base_pfn + count return true; return false; } Do you want to fix it at this moment or just write down the possibility in the description and then we could fix once it really happens later? > > > > > >> > >>> > >>> A thing to get some attention is whether we need READ_ONCE or not > >>> for the local variable mt. > >>> > >> > >> Hmm good point. Staring at __get_pfnblock_flags_mask(), I don't think > >> there is anything stopping the compiler from re-reading the value. But > >> we don't care if we're reading MIGRATE_CMA or MIGRATE_ISOLATE, not > >> something in between. > > > > How about this? > > > > CPU A CPU B > > > > is_pinnable_page > > .. > > .. set_pageblock_migratetype(MIGRATE_ISOLATE) > > mt == MIGRATE_CMA > > get_pageblock_miratetype(page) > > returns MIGRATE_ISOLATE > > mt == MIGRATE_ISOLATE set_pageblock_migratetype(MIGRATE_CMA) > > get_pageblock_miratetype(page) > > returns MIGRATE_CMA > > > > So both conditions fails to detect it. > > I think you're right. That's nasty. Ccing Paul to borrow expertise. :) int mt = get_pageblock_migratetype(page); if (mt == MIGRATE_CMA) return true; if (mt == MIGRATE_ISOLATE) return true; I'd like to keep use the local variable mt's value in folloing conditions checks instead of refetching the value from get_pageblock_migratetype. What's the right way to achieve it? Thanks in advance!
On Tue, May 03, 2022 at 11:08:25AM -0700, Minchan Kim wrote: < snip > Ccing Paul really this time. Attach original code for Paul. diff --git a/include/linux/mm.h b/include/linux/mm.h index 6acca5cecbc5..f59bbe3296e3 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1625,8 +1625,10 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, #ifdef CONFIG_MIGRATION static inline bool is_pinnable_page(struct page *page) { - return !(is_zone_movable_page(page) || is_migrate_cma_page(page)) || - is_zero_pfn(page_to_pfn(page)); + int mt = get_pageblock_migratetype(page); + + return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || + mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); } #else static inline bool is_pinnable_page(struct page *page) -- 2.36.0.464.gb9c8b46e94-goog > > >>> > > >>> A thing to get some attention is whether we need READ_ONCE or not > > >>> for the local variable mt. > > >>> > > >> > > >> Hmm good point. Staring at __get_pfnblock_flags_mask(), I don't think > > >> there is anything stopping the compiler from re-reading the value. But > > >> we don't care if we're reading MIGRATE_CMA or MIGRATE_ISOLATE, not > > >> something in between. > > > > > > How about this? > > > > > > CPU A CPU B > > > > > > is_pinnable_page > > > .. > > > .. set_pageblock_migratetype(MIGRATE_ISOLATE) > > > mt == MIGRATE_CMA > > > get_pageblock_miratetype(page) > > > returns MIGRATE_ISOLATE > > > mt == MIGRATE_ISOLATE set_pageblock_migratetype(MIGRATE_CMA) > > > get_pageblock_miratetype(page) > > > returns MIGRATE_CMA > > > > > > So both conditions fails to detect it. > > > > I think you're right. That's nasty. > > Ccing Paul to borrow expertise. :) > > int mt = get_pageblock_migratetype(page); > > if (mt == MIGRATE_CMA) > return true; > > if (mt == MIGRATE_ISOLATE) > return true; > > I'd like to keep use the local variable mt's value in folloing > conditions checks instead of refetching the value from > get_pageblock_migratetype. > > What's the right way to achieve it? > > Thanks in advance! Paul, could you give any hint?
On Tue, May 03, 2022 at 06:02:33PM +0200, David Hildenbrand wrote: > On 03.05.22 17:26, Minchan Kim wrote: > > On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: > >> > >>>>>> However, I assume we have the same issue right now already with > >>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these > >>> > >>> ZONE_MOVALBE is also changed dynamically? > >>> > >> > >> Sorry, with "same issue" I meant failing to pin if having to migrate and > >> the page is temporarily unmovable. > >> > >>>> there are temporarily unmovable and we fail to migrate. But it would now > >>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... > >>> > >>> Didn't parse your last mention. > >> > >> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have > >> to migrate now when pinning. > > > > I don't understand your point. My problem is pin_user_pages with > > FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area > > without migrating page out of movable zone or CMA area. > > That's why try_grab_folio checks whether target page stays in those > > movable areas. However, to check CMA area, is_migrate_cma_page is > > racy so the FOLL_LONGTERM flag semantic is broken right now. > > > > Do you see any problem of the fix? > > My point is that you might decide to migrate a page because you stumble > over MIGRATE_ISOLATE, although there is no need to reject long-term > pinning and to trigger page migration. > > Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume > someone reserves gigantic pages (alloc_contig_range()) and you have > concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. > > GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to > be migrated, which can fail if the page is temporarily unmovable. A dump question since I'm not familiar with hugetlb. Is above reasonable scenario? The gigantic page is about to be created using alloc_contig_range so they has MIGRATE_ISOLATE as temporal state. It means no one uses the page yet so I guess the page is not mapped at userspace but other is trying to access the page using pin_user_pages? > > See my point? We will try migrating in cases where we don't have to > migrate. I think what we would want to do is always reject pinning a CMA > page, independent of the isolation status. but we don't have that > information available. > > I raised in the past that we should look into preserving the migration > type and turning MIGRATE_ISOLATE essentially into an additional flag. > > > So I guess this patch is the right thing to do for now, but I wanted to > spell out the implications.
On Wed, May 04, 2022 at 03:48:54PM -0700, Minchan Kim wrote: > On Tue, May 03, 2022 at 06:02:33PM +0200, David Hildenbrand wrote: > > On 03.05.22 17:26, Minchan Kim wrote: > > > On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: > > >> > > >>>>>> However, I assume we have the same issue right now already with > > >>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these > > >>> > > >>> ZONE_MOVALBE is also changed dynamically? > > >>> > > >> > > >> Sorry, with "same issue" I meant failing to pin if having to migrate and > > >> the page is temporarily unmovable. > > >> > > >>>> there are temporarily unmovable and we fail to migrate. But it would now > > >>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... > > >>> > > >>> Didn't parse your last mention. > > >> > > >> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have > > >> to migrate now when pinning. > > > > > > I don't understand your point. My problem is pin_user_pages with > > > FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area > > > without migrating page out of movable zone or CMA area. > > > That's why try_grab_folio checks whether target page stays in those > > > movable areas. However, to check CMA area, is_migrate_cma_page is > > > racy so the FOLL_LONGTERM flag semantic is broken right now. > > > > > > Do you see any problem of the fix? > > > > My point is that you might decide to migrate a page because you stumble > > over MIGRATE_ISOLATE, although there is no need to reject long-term > > pinning and to trigger page migration. > > > > Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume > > someone reserves gigantic pages (alloc_contig_range()) and you have > > concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. > > > > GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to > > be migrated, which can fail if the page is temporarily unmovable. > > A dump question since I'm not familiar with hugetlb. > > Is above reasonable scenario? > > The gigantic page is about to be created using alloc_contig_range so > they has MIGRATE_ISOLATE as temporal state. It means no one uses the > page yet so I guess the page is not mapped at userspace but other is > trying to access the page using pin_user_pages? > Too dump question. Never mind. Posted v2 - https://lore.kernel.org/all/20220505064429.2818496-1-minchan@kernel.org/T/#u Thanks.
On 5/4/22 23:48, Minchan Kim wrote: > On Wed, May 04, 2022 at 03:48:54PM -0700, Minchan Kim wrote: >> On Tue, May 03, 2022 at 06:02:33PM +0200, David Hildenbrand wrote: >>> On 03.05.22 17:26, Minchan Kim wrote: >>>> On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: >>>>> >>>>>>>>> However, I assume we have the same issue right now already with >>>>>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these >>>>>> >>>>>> ZONE_MOVALBE is also changed dynamically? >>>>>> >>>>> >>>>> Sorry, with "same issue" I meant failing to pin if having to migrate and >>>>> the page is temporarily unmovable. >>>>> >>>>>>> there are temporarily unmovable and we fail to migrate. But it would now >>>>>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... >>>>>> >>>>>> Didn't parse your last mention. >>>>> >>>>> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have >>>>> to migrate now when pinning. >>>> >>>> I don't understand your point. My problem is pin_user_pages with >>>> FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area >>>> without migrating page out of movable zone or CMA area. >>>> That's why try_grab_folio checks whether target page stays in those >>>> movable areas. However, to check CMA area, is_migrate_cma_page is >>>> racy so the FOLL_LONGTERM flag semantic is broken right now. >>>> >>>> Do you see any problem of the fix? >>> >>> My point is that you might decide to migrate a page because you stumble >>> over MIGRATE_ISOLATE, although there is no need to reject long-term >>> pinning and to trigger page migration. >>> >>> Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume >>> someone reserves gigantic pages (alloc_contig_range()) and you have >>> concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. >>> >>> GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to >>> be migrated, which can fail if the page is temporarily unmovable. >> >> A dump question since I'm not familiar with hugetlb. >> >> Is above reasonable scenario? >> >> The gigantic page is about to be created using alloc_contig_range so >> they has MIGRATE_ISOLATE as temporal state. It means no one uses the >> page yet so I guess the page is not mapped at userspace but other is >> trying to access the page using pin_user_pages? >> > > Too dump question. Never mind. > Posted v2 - https://lore.kernel.org/all/20220505064429.2818496-1-minchan@kernel.org/T/#u > Well your question mentioned hugetlb so my mail filters caught it :) Your question caused me to think of the following. No need for any immediate change: I think. Just wanted to share. Suppose someone has reserved CMA for gigantic hugetlb allocations. And, suppose FOLL_LONGTERM is attempted on such a page (it would be in use). The desired action would be to migrate the page out of CMA. Correct? Gigantic pages can only be migrated IF there is another (already allocated) gigantic page available. The routine to try and allocate a page 'on the fly' for migration will fail if passed a gigantic size. There 'might' be a free pre-allocated gigantic page. However, if the user set up CMA reserves for gigantic page allocations it is likely the free gigantic page is also in CMA. Therefore, it can not be used for this migration. So, unless my reasoning is wrong, FOLL_LONGTERM would almost always fail for gigantic pages in CMA.
On Thu, May 05, 2022 at 10:00:07AM -0700, Mike Kravetz wrote: > Gigantic pages can only be migrated IF there is another (already allocated) > gigantic page available. The routine to try and allocate a page 'on the fly' > for migration will fail if passed a gigantic size. There 'might' be a free > pre-allocated gigantic page. However, if the user set up CMA reserves for > gigantic page allocations it is likely the free gigantic page is also in CMA. > Therefore, it can not be used for this migration. So, unless my reasoning > is wrong, FOLL_LONGTERM would almost always fail for gigantic pages in CMA. I'm probably not familiar enough with CMA, but.. I just noticed that if CMA is destined to not be able to be pinned then maybe it'll lose quite a few scenarios where pinning is a possible use case. It doesn't even need to be the major use case, but as long as it's possible (e.g. hypervisors hosting virtual machines with device assignment) it'll be a hard no to CMA, which seems to be a pity.
On Thu, May 05, 2022 at 10:00:07AM -0700, Mike Kravetz wrote: > On 5/4/22 23:48, Minchan Kim wrote: > > On Wed, May 04, 2022 at 03:48:54PM -0700, Minchan Kim wrote: > >> On Tue, May 03, 2022 at 06:02:33PM +0200, David Hildenbrand wrote: > >>> On 03.05.22 17:26, Minchan Kim wrote: > >>>> On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: > >>>>> > >>>>>>>>> However, I assume we have the same issue right now already with > >>>>>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these > >>>>>> > >>>>>> ZONE_MOVALBE is also changed dynamically? > >>>>>> > >>>>> > >>>>> Sorry, with "same issue" I meant failing to pin if having to migrate and > >>>>> the page is temporarily unmovable. > >>>>> > >>>>>>> there are temporarily unmovable and we fail to migrate. But it would now > >>>>>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... > >>>>>> > >>>>>> Didn't parse your last mention. > >>>>> > >>>>> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have > >>>>> to migrate now when pinning. > >>>> > >>>> I don't understand your point. My problem is pin_user_pages with > >>>> FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area > >>>> without migrating page out of movable zone or CMA area. > >>>> That's why try_grab_folio checks whether target page stays in those > >>>> movable areas. However, to check CMA area, is_migrate_cma_page is > >>>> racy so the FOLL_LONGTERM flag semantic is broken right now. > >>>> > >>>> Do you see any problem of the fix? > >>> > >>> My point is that you might decide to migrate a page because you stumble > >>> over MIGRATE_ISOLATE, although there is no need to reject long-term > >>> pinning and to trigger page migration. > >>> > >>> Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume > >>> someone reserves gigantic pages (alloc_contig_range()) and you have > >>> concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. > >>> > >>> GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to > >>> be migrated, which can fail if the page is temporarily unmovable. > >> > >> A dump question since I'm not familiar with hugetlb. > >> > >> Is above reasonable scenario? > >> > >> The gigantic page is about to be created using alloc_contig_range so > >> they has MIGRATE_ISOLATE as temporal state. It means no one uses the > >> page yet so I guess the page is not mapped at userspace but other is > >> trying to access the page using pin_user_pages? > >> > > > > Too dump question. Never mind. > > Posted v2 - https://lore.kernel.org/all/20220505064429.2818496-1-minchan@kernel.org/T/#u > > > > Well your question mentioned hugetlb so my mail filters caught it :) > > Your question caused me to think of the following. No need for any immediate > change: I think. Just wanted to share. > > Suppose someone has reserved CMA for gigantic hugetlb allocations. And, > suppose FOLL_LONGTERM is attempted on such a page (it would be in use). The > desired action would be to migrate the page out of CMA. Correct? > > Gigantic pages can only be migrated IF there is another (already allocated) > gigantic page available. The routine to try and allocate a page 'on the fly' > for migration will fail if passed a gigantic size. There 'might' be a free > pre-allocated gigantic page. However, if the user set up CMA reserves for > gigantic page allocations it is likely the free gigantic page is also in CMA. > Therefore, it can not be used for this migration. So, unless my reasoning > is wrong, FOLL_LONGTERM would almost always fail for gigantic pages in CMA. FOLL_LONGTERM on CMA-backed gigantic page would already fail, Thanks for sharing! Anyway, David's concern was non-CMA-backed gigantic page. The alloc_contig_range with MIGRATE_ISOLATE runs with concurrent FOLL_LONGTERM pinning, which could trigger page migration we didn't have before so it might increase FOLL_LONGTERM GUP failure rate.
On 05.05.22 08:48, Minchan Kim wrote: > On Wed, May 04, 2022 at 03:48:54PM -0700, Minchan Kim wrote: >> On Tue, May 03, 2022 at 06:02:33PM +0200, David Hildenbrand wrote: >>> On 03.05.22 17:26, Minchan Kim wrote: >>>> On Tue, May 03, 2022 at 03:15:24AM +0200, David Hildenbrand wrote: >>>>> >>>>>>>>> However, I assume we have the same issue right now already with >>>>>>> ZONE_MOVABLE and MIGRATE_CMA when trying to pin a page residing on these >>>>>> >>>>>> ZONE_MOVALBE is also changed dynamically? >>>>>> >>>>> >>>>> Sorry, with "same issue" I meant failing to pin if having to migrate and >>>>> the page is temporarily unmovable. >>>>> >>>>>>> there are temporarily unmovable and we fail to migrate. But it would now >>>>>>> apply even without ZONE_MOVABLE or MIGRATE_CMA. Hm... >>>>>> >>>>>> Didn't parse your last mention. >>>>> >>>>> On a system that neither uses ZONE_MOVABLE nor MIGRATE_CMA we might have >>>>> to migrate now when pinning. >>>> >>>> I don't understand your point. My problem is pin_user_pages with >>>> FOLL_LONGTERM. It shouldn't pin a page from ZONE_MOVABLE and cma area >>>> without migrating page out of movable zone or CMA area. >>>> That's why try_grab_folio checks whether target page stays in those >>>> movable areas. However, to check CMA area, is_migrate_cma_page is >>>> racy so the FOLL_LONGTERM flag semantic is broken right now. >>>> >>>> Do you see any problem of the fix? >>> >>> My point is that you might decide to migrate a page because you stumble >>> over MIGRATE_ISOLATE, although there is no need to reject long-term >>> pinning and to trigger page migration. >>> >>> Assume a system without ZONE_MOVABLE and without MIGRATE_CMA. Assume >>> someone reserves gigantic pages (alloc_contig_range()) and you have >>> concurrent long-term pinning on a page that is no MIGRATE_ISOLATE. >>> >>> GUP would see MIGRATE_ISOLATE and would reject pinning. The page has to >>> be migrated, which can fail if the page is temporarily unmovable. >> >> A dump question since I'm not familiar with hugetlb. >> >> Is above reasonable scenario? >> >> The gigantic page is about to be created using alloc_contig_range so >> they has MIGRATE_ISOLATE as temporal state. It means no one uses the >> page yet so I guess the page is not mapped at userspace but other is >> trying to access the page using pin_user_pages? >> > > Too dump question. Never mind. > Posted v2 - https://lore.kernel.org/all/20220505064429.2818496-1-minchan@kernel.org/T/#u Sorry for the late reply, still traveling :) Just so we're on the same page: MIGRATE_ISOLATE would be set on pageblocks that contain either free or movable pages. In case of movable pages, they are in uese. Regarding your is_cma_page() proposal, I think we might want to consider that if it really turns out to be a problem. For now, I'm fine with just documenting it. Thanks!
On 05.05.22 19:25, Peter Xu wrote: > On Thu, May 05, 2022 at 10:00:07AM -0700, Mike Kravetz wrote: >> Gigantic pages can only be migrated IF there is another (already allocated) >> gigantic page available. The routine to try and allocate a page 'on the fly' >> for migration will fail if passed a gigantic size. There 'might' be a free >> pre-allocated gigantic page. However, if the user set up CMA reserves for >> gigantic page allocations it is likely the free gigantic page is also in CMA. >> Therefore, it can not be used for this migration. So, unless my reasoning >> is wrong, FOLL_LONGTERM would almost always fail for gigantic pages in CMA. > > I'm probably not familiar enough with CMA, but.. I just noticed that if CMA > is destined to not be able to be pinned then maybe it'll lose quite a few > scenarios where pinning is a possible use case. It doesn't even need to be > the major use case, but as long as it's possible (e.g. hypervisors hosting > virtual machines with device assignment) it'll be a hard no to CMA, which > seems to be a pity. > Well, the same applies to ZONE_MOVABLE as well, unfortunately. Eventually, we might want to disable placing hugetlb pages on CMA areas if it turns out to be a problem. In case of ZONE_MOVABLE we can already fail "gracefully" when trying offlining (although that's really far from beautiful).
diff --git a/include/linux/mm.h b/include/linux/mm.h index 6acca5cecbc5..f59bbe3296e3 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1625,8 +1625,10 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, #ifdef CONFIG_MIGRATION static inline bool is_pinnable_page(struct page *page) { - return !(is_zone_movable_page(page) || is_migrate_cma_page(page)) || - is_zero_pfn(page_to_pfn(page)); + int mt = get_pageblock_migratetype(page); + + return !(is_zone_movable_page(page) || mt == MIGRATE_CMA || + mt == MIGRATE_ISOLATE || is_zero_pfn(page_to_pfn(page))); } #else static inline bool is_pinnable_page(struct page *page)
Pages on CMA area could have MIGRATE_ISOLATE as well as MIGRATE_CMA so current is_pinnable_page could miss CMA pages which has MIGRATE_ ISOLATE. It ends up putting CMA pages longterm pinning possible on pin_user_pages APIs so CMA allocation fails. The CMA allocation path protects the migration type change race using zone->lock but what GUP path need to know is just whether the page is on CMA area or not rather than exact type. Thus, we don't need zone->lock but just checks the migratype in either of (MIGRATE_ISOLATE and MIGRATE_CMA). Signed-off-by: Minchan Kim <minchan@kernel.org> --- include/linux/mm.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)