Message ID | 20241015164106.465253-1-kernel@pankajraghav.com (mailing list archive)
---|---
State | New
Series | [v14] mm: don't set readahead flag on a folio when lookahead_size > nr_to_read
Oops. This is v1 not v14. -- Pankaj
On Tue, Oct 15, 2024 at 06:41:06PM +0200, Pankaj Raghav (Samsung) wrote:

v14? Where are v1..13 of this patch? It's the first time I've seen it.

> The readahead flag is set on a folio based on the lookahead_size and
> nr_to_read. For example, when the readahead happens from index to index
> + nr_to_read, then the readahead `mark` offset from index is set at
> nr_to_read - lookahead_size.
>
> There are some scenarios where the lookahead_size > nr_to_read. If this
> happens, readahead flag is not set on any folio on the current
> readahead window.

Please describe those scenarios, as that's the important bit.

> There are two problems at the moment in the way `mark` is calculated
> when lookahead_size > nr_to_read:
>
> - unsigned long `mark` will be assigned a negative value which can lead
>   to unexpected results in extreme cases due to wrap around.

Can such an extreme case happen?

> - The current calculation for `mark` with mapping_min_order > 0 gives
>   incorrect results when lookahead_size > nr_to_read due to rounding
>   up operation.
>
> Explicitly initialize `mark` to be ULONG_MAX and only calculate it
> when lookahead_size is within the readahead window.

You haven't really spelled out the consequences of this properly.
Perhaps a worked example would help. I think the worst case scenario
is that we set the flag on the wrong folio, causing readahead to occur
when it should not. But maybe I'm wrong?
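A minimal user-space sketch of the mark placement described in the quoted commit message; the values are made up for illustration and this is not the kernel code itself:

#include <stdio.h>

int main(void)
{
	/* Normal case: lookahead_size <= nr_to_read, so the mark falls
	 * inside the readahead window (hypothetical values). */
	unsigned long index = 0, nr_to_read = 16, lookahead_size = 4;
	unsigned long mark = nr_to_read - lookahead_size;

	/* The folio at index + mark gets the readahead flag; reaching it
	 * later is what triggers the next asynchronous readahead. */
	printf("readahead flag at offset %lu, i.e. index %lu\n",
	       mark, index + mark);
	return 0;
}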
On Tue, Oct 15, 2024 at 06:33:11PM +0100, Matthew Wilcox wrote:
> On Tue, Oct 15, 2024 at 06:41:06PM +0200, Pankaj Raghav (Samsung) wrote:
>
> v14? Where are v1..13 of this patch? It's the first time I've seen it.

Sorry for the confusion. My git send script messed up the version
number. It is v1 :)

> > The readahead flag is set on a folio based on the lookahead_size and
> > nr_to_read. For example, when the readahead happens from index to index
> > + nr_to_read, then the readahead `mark` offset from index is set at
> > nr_to_read - lookahead_size.
> >
> > There are some scenarios where the lookahead_size > nr_to_read. If this
> > happens, readahead flag is not set on any folio on the current
> > readahead window.
>
> Please describe those scenarios, as that's the important bit.

Yes. I will do that in the next version. do_page_cache_ra() can clamp
the nr_to_read if the readahead window extends beyond EOF. I think this
probably happens when the readahead window was created and the file was
truncated before the readahead starts.

> > There are two problems at the moment in the way `mark` is calculated
> > when lookahead_size > nr_to_read:
> >
> > - unsigned long `mark` will be assigned a negative value which can lead
> >   to unexpected results in extreme cases due to wrap around.
>
> Can such an extreme case happen?

I think this is highly unlikely. I will probably remove this reason
from the commit message. It was just a bit weird to me that we are
assigning a negative number to an unsigned value which is supposed to
be the offset.

> > - The current calculation for `mark` with mapping_min_order > 0 gives
> >   incorrect results when lookahead_size > nr_to_read due to rounding
> >   up operation.
> >
> > Explicitly initialize `mark` to be ULONG_MAX and only calculate it
> > when lookahead_size is within the readahead window.
>
> You haven't really spelled out the consequences of this properly.
> Perhaps a worked example would help.

Got it. I saw this while running generic/476 on XFS with 64k block size.

Let's assume the following values:
index = 128
nr_to_read = 16
lookahead_size = 28
mapping_min_order = 4 (16 pages)

The lookahead_size is actually lying outside the current readahead
window. The calculation without this patch will result in incorrect mark
as follows:

ra_folio_index = round_up(128 + 16 - 28, 16) = 128;
mark = 128 - 128 = 0;

So we will be marking the folio on 0th index with RA flag, even though
we shouldn't have. Does that make sense?

> I think the worst case scenario is that we set the flag on the wrong
> folio, causing readahead to occur when it should not. But maybe I'm
> wrong?

You are right. We might unnecessarily trigger a readahead where we
should not. I think it is good to mention this consequence as well in
the commit message to be clear.

--
Pankaj
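A minimal user-space sketch of the pre-patch calculation with the values from the example above; round_up() is open-coded here as a stand-in for the kernel helper:

#include <stdio.h>

/* Open-coded power-of-two round up, standing in for the kernel's round_up(). */
static unsigned long round_up_pow2(unsigned long x, unsigned long align)
{
	return (x + align - 1) & ~(align - 1);
}

int main(void)
{
	unsigned long index = 128, nr_to_read = 16, lookahead_size = 28;
	unsigned long min_nrpages = 16;	/* mapping_min_order = 4 */
	unsigned long ra_folio_index, mark;

	/* Pre-patch formula: 128 + 16 - 28 = 116, which already lies
	 * before the window start; rounding up to min_nrpages lands
	 * back on 128, so mark comes out as 0. */
	ra_folio_index = round_up_pow2(index + nr_to_read - lookahead_size,
				       min_nrpages);
	mark = ra_folio_index - index;

	printf("pre-patch mark = %lu\n", mark);	/* prints 0 */
	return 0;
}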
On Wed, Oct 16, 2024 at 03:35:27PM +0530, Pankaj Raghav (Samsung) wrote:
> > > - The current calculation for `mark` with mapping_min_order > 0 gives
> > >   incorrect results when lookahead_size > nr_to_read due to rounding
> > >   up operation.
> > >
> > > Explicitly initialize `mark` to be ULONG_MAX and only calculate it
> > > when lookahead_size is within the readahead window.
> >
> > You haven't really spelled out the consequences of this properly.
> > Perhaps a worked example would help.
>
> Got it. I saw this while running generic/476 on XFS with 64k block size.
>
> Let's assume the following values:
> index = 128
> nr_to_read = 16
> lookahead_size = 28
> mapping_min_order = 4 (16 pages)
>
> The lookahead_size is actually lying outside the current readahead
> window. The calculation without this patch will result in incorrect mark
> as follows:
>
> ra_folio_index = round_up(128 + 16 - 28, 16) = 128;
> mark = 128 - 128 = 0;
>
> So we will be marking the folio on 0th index with RA flag, even though
> we shouldn't have. Does that make sense?

But we don't go back and find the folio for index 0. We only consider
the folios we're actually reading for marking. So if 'mark' lies
outside the readahead window, we simply won't mark any of them. So I
don't think your patch changes anything. Or did I miss something?
On Wed, Oct 16, 2024 at 12:57:44PM +0100, Matthew Wilcox wrote:
> On Wed, Oct 16, 2024 at 03:35:27PM +0530, Pankaj Raghav (Samsung) wrote:
> > > > - The current calculation for `mark` with mapping_min_order > 0 gives
> > > >   incorrect results when lookahead_size > nr_to_read due to rounding
> > > >   up operation.
> > > >
> > > > Explicitly initialize `mark` to be ULONG_MAX and only calculate it
> > > > when lookahead_size is within the readahead window.
> > >
> > > You haven't really spelled out the consequences of this properly.
> > > Perhaps a worked example would help.
> >
> > Got it. I saw this while running generic/476 on XFS with 64k block size.
> >
> > Let's assume the following values:
> > index = 128
> > nr_to_read = 16
> > lookahead_size = 28
> > mapping_min_order = 4 (16 pages)
> >
> > The lookahead_size is actually lying outside the current readahead
> > window. The calculation without this patch will result in incorrect mark
> > as follows:
> >
> > ra_folio_index = round_up(128 + 16 - 28, 16) = 128;
> > mark = 128 - 128 = 0;
> >
> > So we will be marking the folio on 0th index with RA flag, even though

Oops. I shouldn't have said 0th index. I meant at offset 0 from the
index.

> > we shouldn't have. Does that make sense?
>
> But we don't go back and find the folio for index 0. We only consider
> the folios we're actually reading for marking. So if 'mark' lies
> outside the readahead window, we simply won't mark any of them. So I

`mark` is the offset from index. So we compare `mark` with the iterator
`i`, which starts at 0. So we will set the RA flag on index 128 in this
example, which is not correct.

	if (i == mark)
		folio_set_readahead(folio);

With this patch, mark will be ULONG_MAX and not 0.

> don't think your patch changes anything. Or did I miss something?

Does that make sense?
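A minimal user-space simulation of that comparison, showing why the old mark of 0 flags the first folio of the window while ULONG_MAX flags nothing; the loop is simplified from page_cache_ra_unbounded() and folio allocation is omitted:

#include <stdio.h>
#include <limits.h>

int main(void)
{
	unsigned long index = 128, nr_to_read = 16, min_nrpages = 16;
	unsigned long old_mark = 0;		/* pre-patch result for this example */
	unsigned long new_mark = ULONG_MAX;	/* post-patch result */
	unsigned long i;

	/* Walk the readahead window one min-order folio at a time,
	 * as the kernel loop does with its iterator i. */
	for (i = 0; i < nr_to_read; i += min_nrpages) {
		if (i == old_mark)
			printf("old: RA flag set on folio at index %lu\n",
			       index + i);
		if (i == new_mark)
			printf("new: RA flag set on folio at index %lu\n",
			       index + i);
	}

	/* Only the "old" line prints: pre-patch the folio at index 128 is
	 * wrongly flagged; post-patch nothing in this window is flagged. */
	return 0;
}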
diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..475d2940a1ed 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -206,9 +206,9 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		unsigned long nr_to_read, unsigned long lookahead_size)
 {
 	struct address_space *mapping = ractl->mapping;
-	unsigned long ra_folio_index, index = readahead_index(ractl);
+	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long mark, i = 0;
+	unsigned long mark = ULONG_MAX, i = 0;
 	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
 
 	/*
@@ -232,9 +232,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	 * index that only has lookahead or "async_region" to set the
 	 * readahead flag.
 	 */
-	ra_folio_index = round_up(readahead_index(ractl) + nr_to_read - lookahead_size,
-				  min_nrpages);
-	mark = ra_folio_index - index;
+	if (lookahead_size <= nr_to_read) {
+		unsigned long ra_folio_index;
+
+		ra_folio_index = round_up(readahead_index(ractl) +
+					  nr_to_read - lookahead_size,
+					  min_nrpages);
+		mark = ra_folio_index - index;
+	}
 	nr_to_read += readahead_index(ractl) - index;
 	ractl->_index = index;