Message ID | 20201217185243.3288048-10-pasha.tatashin@soleen.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | prohibit pinning pages in ZONE_MOVABLE | expand |
On 12/17/20 10:52 AM, Pavel Tatashin wrote: > In gup_test both gup_flags and test_flags use the same flags field. > This is broken, because gup_flags can be passed as raw value (via -F hex), > which can overwrite all the test flags. Thanks for finding and fixing the "stuck at 0x1" bug! The test is not quite as broken as you think, though, because you're not looking at the history and intent of the FOLL_WRITE, and how that is used. And that is leading you to make wider "fixes" than I'd recommend. The new -F command line switch is the latest, newest (and therefore flakiest) addition, and *that* part is worth fixing up, because in addition to being stuck at 0x1 as you noticed, it also misleads everyone into thinking that it's intended for general gup flags. It's not. It really was meant to be a subtest control flag (controlling the behavior of the dump pages sub-test), as you can see in the commit log and documentation: For example: ./gup_test -ct -F 1 0 19 0x1000 Meaning: -c: dump pages sub-test -t: use THP pages -F 1: use pin_user_pages() instead of get_user_pages() 0 19 0x1000: dump pages 0, 19, and 4096 > > Farther, in the actual gup_test.c all the passed gup_flags are erased and > unconditionally replaced with FOLL_WRITE. This is still desirable; Kirill's original intention of only allowing FOLL_WRITE or nothing is still what we want. There is no desire to pass in general gup_flags. > > Which means that test_flags are ignored, and code like this always > performs pin dump test: > > 155 if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) > 156 nr = pin_user_pages(addr, nr, gup->flags, > 157 pages + i, NULL); > 158 else > 159 nr = get_user_pages(addr, nr, gup->flags, > 160 pages + i, NULL); > 161 break; > > Add a new test_flags field, to allow raw gup_flags to work. I think .test_control_flags field would be a good name, to make it very clear that it's not destined for gup_flags. Just .test_flags is not quite as clear a distinction from .gup_flags, as .test_control_flags is, IMHO. > Add a new subcommand for DUMP_USER_PAGES_TEST to specify that pin test > should be performed. It's arguably better to just keep using the -c option instead, plus the new .test_control_flags field. Otherwise you get a multiplication of possibilities (and therefore, of command line options). > Remove unconditional overwriting of gup_flags via FOLL_WRITE. But, Nope, let's not do that. > preserve the previous behaviour where FOLL_WRITE was the default flag, > and add a new option "-W" to unset FOLL_WRITE. Or that either. > > With the fix, dump works like this: > > root@virtme:/# gup_test -c > ---- page #0, starting from user virt addr: 0x7f8acb9e4000 > page:00000000d3d2ee27 refcount:2 mapcount:1 mapping:0000000000000000 > index:0x0 pfn:0x100bcf > anon flags: 0x300000000080016(referenced|uptodate|lru|swapbacked) > raw: 0300000000080016 ffffd0e204021608 ffffd0e208df2e88 ffff8ea04243ec61 > raw: 0000000000000000 0000000000000000 0000000200000000 0000000000000000 > page dumped because: gup_test: dump_pages() test > DUMP_USER_PAGES_TEST: done > > root@virtme:/# gup_test -c -p > ---- page #0, starting from user virt addr: 0x7fd19701b000 > page:00000000baed3c7d refcount:1025 mapcount:1 mapping:0000000000000000 > index:0x0 pfn:0x108008 > anon flags: 0x300000000080014(uptodate|lru|swapbacked) > raw: 0300000000080014 ffffd0e204200188 ffffd0e205e09088 ffff8ea04243ee71 > raw: 0000000000000000 0000000000000000 0000040100000000 0000000000000000 > page dumped because: gup_test: dump_pages() test > DUMP_USER_PAGES_TEST: done > > Refcount shows the difference between pin vs no-pin case. > Also change type of nr from int to long, as it counts number of pages. Why? Is there a bug? gup only handles int sized amounts of pages so this is just adding to the churn, yes? > > Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> > --- > mm/gup_test.c | 9 +++------ > mm/gup_test.h | 1 + > tools/testing/selftests/vm/gup_test.c | 11 +++++++++-- > 3 files changed, 13 insertions(+), 8 deletions(-) > > diff --git a/mm/gup_test.c b/mm/gup_test.c > index e3cf78e5873e..24c70c5814ba 100644 > --- a/mm/gup_test.c > +++ b/mm/gup_test.c > @@ -94,7 +94,7 @@ static int __gup_test_ioctl(unsigned int cmd, > { > ktime_t start_time, end_time; > unsigned long i, nr_pages, addr, next; > - int nr; > + long nr; > struct page **pages; > int ret = 0; > bool needs_mmap_lock = > @@ -126,9 +126,6 @@ static int __gup_test_ioctl(unsigned int cmd, > nr = (next - addr) / PAGE_SIZE; > } > > - /* Filter out most gup flags: only allow a tiny subset here: */ > - gup->flags &= FOLL_WRITE; > - > switch (cmd) { > case GUP_FAST_BENCHMARK: > nr = get_user_pages_fast(addr, nr, gup->flags, > @@ -152,7 +149,7 @@ static int __gup_test_ioctl(unsigned int cmd, > pages + i, NULL); > break; > case DUMP_USER_PAGES_TEST: > - if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) > + if (gup->test_flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) > nr = pin_user_pages(addr, nr, gup->flags, > pages + i, NULL); > else > @@ -187,7 +184,7 @@ static int __gup_test_ioctl(unsigned int cmd, > > start_time = ktime_get(); > > - put_back_pages(cmd, pages, nr_pages, gup->flags); > + put_back_pages(cmd, pages, nr_pages, gup->test_flags); > > end_time = ktime_get(); > gup->put_delta_usec = ktime_us_delta(end_time, start_time); > diff --git a/mm/gup_test.h b/mm/gup_test.h > index 90a6713d50eb..b1b376b5d3b2 100644 > --- a/mm/gup_test.h > +++ b/mm/gup_test.h > @@ -22,6 +22,7 @@ struct gup_test { > __u64 size; > __u32 nr_pages_per_call; > __u32 flags; > + __u32 test_flags; > /* > * Each non-zero entry is the number of the page (1-based: first page is > * page 1, so that zero entries mean "do nothing") from the .addr base. > diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c > index 6c6336dd3b7f..42c71483729f 100644 > --- a/tools/testing/selftests/vm/gup_test.c > +++ b/tools/testing/selftests/vm/gup_test.c > @@ -37,13 +37,13 @@ int main(int argc, char **argv) > { > struct gup_test gup = { 0 }; > unsigned long size = 128 * MB; > - int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 0; > + int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 1; Let's leave that alone, as noted above. > unsigned long cmd = GUP_FAST_BENCHMARK; > int flags = MAP_PRIVATE; > char *file = "/dev/zero"; > char *p; > > - while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwSH")) != -1) { > + while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHp")) != -1) { And also as noted above I don't think we need this or any of the remaining hunks below. > switch (opt) { > case 'a': > cmd = PIN_FAST_BENCHMARK; > @@ -65,6 +65,10 @@ int main(int argc, char **argv) > */ > gup.which_pages[0] = 1; > break; > + case 'p': > + /* works only with DUMP_USER_PAGES_TEST */ > + gup.test_flags |= GUP_TEST_FLAG_DUMP_PAGES_USE_PIN; > + break; > case 'F': > /* strtol, so you can pass flags in hex form */ > gup.flags = strtol(optarg, 0, 0); > @@ -93,6 +97,9 @@ int main(int argc, char **argv) > case 'w': > write = 1; > break; > + case 'W': > + write = 0; > + break; > case 'f': > file = optarg; > break; > thanks,
On 12/18/20 1:06 AM, John Hubbard wrote: >> Add a new test_flags field, to allow raw gup_flags to work. > > I think .test_control_flags field would be a good name, to make it very > clear that it's not destined for gup_flags. Just .test_flags is not quite > as clear a distinction from .gup_flags, as .test_control_flags is, IMHO. > And maybe renaming .flags to .gup_flags, just to make it really clear. thanks,
diff --git a/mm/gup_test.c b/mm/gup_test.c index e3cf78e5873e..24c70c5814ba 100644 --- a/mm/gup_test.c +++ b/mm/gup_test.c @@ -94,7 +94,7 @@ static int __gup_test_ioctl(unsigned int cmd, { ktime_t start_time, end_time; unsigned long i, nr_pages, addr, next; - int nr; + long nr; struct page **pages; int ret = 0; bool needs_mmap_lock = @@ -126,9 +126,6 @@ static int __gup_test_ioctl(unsigned int cmd, nr = (next - addr) / PAGE_SIZE; } - /* Filter out most gup flags: only allow a tiny subset here: */ - gup->flags &= FOLL_WRITE; - switch (cmd) { case GUP_FAST_BENCHMARK: nr = get_user_pages_fast(addr, nr, gup->flags, @@ -152,7 +149,7 @@ static int __gup_test_ioctl(unsigned int cmd, pages + i, NULL); break; case DUMP_USER_PAGES_TEST: - if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) + if (gup->test_flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) nr = pin_user_pages(addr, nr, gup->flags, pages + i, NULL); else @@ -187,7 +184,7 @@ static int __gup_test_ioctl(unsigned int cmd, start_time = ktime_get(); - put_back_pages(cmd, pages, nr_pages, gup->flags); + put_back_pages(cmd, pages, nr_pages, gup->test_flags); end_time = ktime_get(); gup->put_delta_usec = ktime_us_delta(end_time, start_time); diff --git a/mm/gup_test.h b/mm/gup_test.h index 90a6713d50eb..b1b376b5d3b2 100644 --- a/mm/gup_test.h +++ b/mm/gup_test.h @@ -22,6 +22,7 @@ struct gup_test { __u64 size; __u32 nr_pages_per_call; __u32 flags; + __u32 test_flags; /* * Each non-zero entry is the number of the page (1-based: first page is * page 1, so that zero entries mean "do nothing") from the .addr base. diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c index 6c6336dd3b7f..42c71483729f 100644 --- a/tools/testing/selftests/vm/gup_test.c +++ b/tools/testing/selftests/vm/gup_test.c @@ -37,13 +37,13 @@ int main(int argc, char **argv) { struct gup_test gup = { 0 }; unsigned long size = 128 * MB; - int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 0; + int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 1; unsigned long cmd = GUP_FAST_BENCHMARK; int flags = MAP_PRIVATE; char *file = "/dev/zero"; char *p; - while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwSH")) != -1) { + while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHp")) != -1) { switch (opt) { case 'a': cmd = PIN_FAST_BENCHMARK; @@ -65,6 +65,10 @@ int main(int argc, char **argv) */ gup.which_pages[0] = 1; break; + case 'p': + /* works only with DUMP_USER_PAGES_TEST */ + gup.test_flags |= GUP_TEST_FLAG_DUMP_PAGES_USE_PIN; + break; case 'F': /* strtol, so you can pass flags in hex form */ gup.flags = strtol(optarg, 0, 0); @@ -93,6 +97,9 @@ int main(int argc, char **argv) case 'w': write = 1; break; + case 'W': + write = 0; + break; case 'f': file = optarg; break;
In gup_test both gup_flags and test_flags use the same flags field. This is broken, because gup_flags can be passed as raw value (via -F hex), which can overwrite all the test flags. Farther, in the actual gup_test.c all the passed gup_flags are erased and unconditionally replaced with FOLL_WRITE. Which means that test_flags are ignored, and code like this always performs pin dump test: 155 if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN) 156 nr = pin_user_pages(addr, nr, gup->flags, 157 pages + i, NULL); 158 else 159 nr = get_user_pages(addr, nr, gup->flags, 160 pages + i, NULL); 161 break; Add a new test_flags field, to allow raw gup_flags to work. Add a new subcommand for DUMP_USER_PAGES_TEST to specify that pin test should be performed. Remove unconditional overwriting of gup_flags via FOLL_WRITE. But, preserve the previous behaviour where FOLL_WRITE was the default flag, and add a new option "-W" to unset FOLL_WRITE. With the fix, dump works like this: root@virtme:/# gup_test -c ---- page #0, starting from user virt addr: 0x7f8acb9e4000 page:00000000d3d2ee27 refcount:2 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x100bcf anon flags: 0x300000000080016(referenced|uptodate|lru|swapbacked) raw: 0300000000080016 ffffd0e204021608 ffffd0e208df2e88 ffff8ea04243ec61 raw: 0000000000000000 0000000000000000 0000000200000000 0000000000000000 page dumped because: gup_test: dump_pages() test DUMP_USER_PAGES_TEST: done root@virtme:/# gup_test -c -p ---- page #0, starting from user virt addr: 0x7fd19701b000 page:00000000baed3c7d refcount:1025 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x108008 anon flags: 0x300000000080014(uptodate|lru|swapbacked) raw: 0300000000080014 ffffd0e204200188 ffffd0e205e09088 ffff8ea04243ee71 raw: 0000000000000000 0000000000000000 0000040100000000 0000000000000000 page dumped because: gup_test: dump_pages() test DUMP_USER_PAGES_TEST: done Refcount shows the difference between pin vs no-pin case. Also change type of nr from int to long, as it counts number of pages. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> --- mm/gup_test.c | 9 +++------ mm/gup_test.h | 1 + tools/testing/selftests/vm/gup_test.c | 11 +++++++++-- 3 files changed, 13 insertions(+), 8 deletions(-)