Message ID: PUZPR04MB63168A32AB45E8924B52CBC2817B2@PUZPR04MB6316.apcprd04.prod.outlook.com (mailing list archive)
State: New, archived
Series: exfat: fix file not locking when writing zeros in exfat_file_mmap()
On Wed, Jan 24, 2024 at 05:00:37AM +0000, Yuezhang.Mo@sony.com wrote:
> inode->i_rwsem should be locked when writing file. But the lock
> is missing when writing zeros to the file in exfat_file_mmap().

This is actually very weird behaviour in exfat. This kind of "I must
manipulate the on-disc layout" is not generally done in mmap(), it's
done in ->page_mkwrite() or even delayed until we actually do writeback.
Why does exfat do this?
From: Matthew Wilcox <willy@infradead.org>
Sent: Wednesday, January 24, 2024 1:21 PM
To: Mo, Yuezhang <Yuezhang.Mo@sony.com>
Subject: Re: [PATCH] exfat: fix file not locking when writing zeros in exfat_file_mmap()

> On Wed, Jan 24, 2024 at 05:00:37AM +0000, Yuezhang.Mo@sony.com wrote:
> > inode->i_rwsem should be locked when writing file. But the lock
> > is missing when writing zeros to the file in exfat_file_mmap().
>
> This is actually very weird behaviour in exfat. This kind of "I must
> manipulate the on-disc layout" is not generally done in mmap(), it's
> done in ->page_mkwrite() or even delayed until we actually do writeback.
> Why does exfat do this?

In exfat, "valid_size" describes how far into the data stream user data
has been written and "size" describes the file size. Reads in the range
"valid_size"~"size" return zeros.

For example,

(1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
    - Write 0x59 to 0~1023
    - both "size" and "valid_size" are 1024
(2) xfs_io -t -f -c "truncate 4K" $filename
    - "valid_size" is still 1024
    - "size" is changed to 4096
    - 1024~4095 is not zeroed
    - reads of 1024~4095 return zeros
(3) xfs_io -t -f -c "mmap -rw 0 3072" -c "mwrite -S 0x5a 2048 512" $filename
(3.1) "mmap -rw 0 3072"
    - write zeros to 1024~3071
    - "valid_size" is changed to 3072
    - "size" is still 4096
(3.2) "mwrite -S 0x5a 2048 512"
    - write 0x5a to 2048~2559
    - "valid_size" is still 3072
    - "size" is still 4096

To avoid leaving 1024~2047 un-zeroed, and to avoid having to update
"valid_size" in (3.2), I zero 1024~3071 in (3.1).

If you have a better solution, you are welcome to contribute it to
exfat or share it in detail.
On Wed, Jan 24, 2024 at 10:05:15AM +0000, Yuezhang.Mo@sony.com wrote:
> From: Matthew Wilcox <willy@infradead.org>
> Sent: Wednesday, January 24, 2024 1:21 PM
> To: Mo, Yuezhang <Yuezhang.Mo@sony.com>
> Subject: Re: [PATCH] exfat: fix file not locking when writing zeros in exfat_file_mmap()
> > On Wed, Jan 24, 2024 at 05:00:37AM +0000, Yuezhang.Mo@sony.com wrote:
> > > inode->i_rwsem should be locked when writing file. But the lock
> > > is missing when writing zeros to the file in exfat_file_mmap().
> >
> > This is actually very weird behaviour in exfat. This kind of "I must
> > manipulate the on-disc layout" is not generally done in mmap(), it's
> > done in ->page_mkwrite() or even delayed until we actually do writeback.
> > Why does exfat do this?
>
> In exfat, "valid_size" describes how far into the data stream user data has been
> written and "size" describes the file size. Return zeros if read "valid_size"~"size".

Yes, it's like a very weak implementation of sparse files. There can
only be one zero range and it must be at the end (as opposed to most
unix derived filesystems which allow arbitrary zero ranges anywhere in
the file).

> For example,
>
> (1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
> - Write 0x59 to 0~1023
> - both "size" and "valid_size" are 1024

Yes.

> (2) xfs_io -t -f -c "truncate 4K" $filename
> - "valid_size" is still 1024
> - "size" is changed to 4096
> - 1024~4095 is not zeroed
> - return zeros if read 1024~4095

Yes.

> (3) xfs_io -t -f -c "mmap -rw 0 3072" -c "mwrite -S 0x5a 2048 512" $filename
> (3.1) "mmap -rw 0 3072"
> - write zeros to 1024~3071
> - "valid_size" is changed to 3072
> - "size" is still 4096

That's not what the code says you do. Is this from a trace or were you
making up an example?

	loff_t start = ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
	loff_t end = min_t(loff_t, i_size_read(inode),
			   start + vma->vm_end - vma->vm_start);
	ret = exfat_file_zeroed_range(file, ei->valid_size, end);

vm_end - vm_start will be 4kB because Linux rounds to PAGE_SIZE even if
you ask to map 3072 bytes.

In any case, most filesystems do not do this, and I don't understand
why exfat does. Just because a file is mmaped writable doesn't mean
that we're necessarily going to write to it. The right thing to do
here is defer the changing of ->valid_size until you have already
written back the new data.

> (3.2) "mwrite -S 0x5a 2048 512"
> - write 0x5a to 2048~2559
> - "valid_size" is still 3072
> - "size" is still 4096
>
> To avoid 1024~2047 is not zeroed and no need to update "valid_size" in (3.2),
> I zero 1024~3071 in (3.1).
>
> If you have a better solution, welcome to contribute to exfat or share your
> solution in detail.

Update ->valid_size in the writeback path. If I'm reading
exfat_get_block() correctly, you already correctly zero pages in the
page cache that are read after ->valid_size, so there is very little
work for you to do.

Oh! I just figured out why you probably do it this way.

(1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
(2) xfs_io -t -f -c "truncate 4T" $filename
(3) xfs_io -t -f -c "mmap -rw 3T 4096" -c "mwrite -S 0x5a 3T 512" $filename

Now (at whatever point you're delaying writing zeroes to) you have to
write 3TB of zeroes. And it's probably better to do that at mmap time
than at page fault time, so you can still return an error. It's a bit
weird to return ENOSPC from mmap, but here we are.

It'd be nice to have a comment to explain this. Also, it seems to me
that I can write a script that floods the kernel log with:

	exfat_err(inode->i_sb,
		  "mmap: fail to zero from %llu to %llu(%d)",
		  start, end, ret);

That error message should probably be taken out entirely (maybe use a
tracepoint if you really want some kind of logging).
On Wed, Jan 24, 2024 at 10:05:15AM +0000, Yuezhang.Mo@sony.com wrote:
> From: Matthew Wilcox <willy@infradead.org>
> Sent: Wednesday, January 24, 2024 1:21 PM
> To: Mo, Yuezhang <Yuezhang.Mo@sony.com>
> Subject: Re: [PATCH] exfat: fix file not locking when writing zeros in exfat_file_mmap()
> > On Wed, Jan 24, 2024 at 05:00:37AM +0000, Yuezhang.Mo@sony.com wrote:
> > > inode->i_rwsem should be locked when writing file. But the lock
> > > is missing when writing zeros to the file in exfat_file_mmap().
> >
> > This is actually very weird behaviour in exfat. This kind of "I must
> > manipulate the on-disc layout" is not generally done in mmap(), it's
> > done in ->page_mkwrite() or even delayed until we actually do writeback.
> > Why does exfat do this?
>
> In exfat, "valid_size" describes how far into the data stream user data has been
> written and "size" describes the file size. Return zeros if read "valid_size"~"size".
>
> For example,
>
> (1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
> - Write 0x59 to 0~1023
> - both "size" and "valid_size" are 1024
> (2) xfs_io -t -f -c "truncate 4K" $filename
> - "valid_size" is still 1024
> - "size" is changed to 4096
> - 1024~4095 is not zeroed

I think that's the problem right there. File extension via truncate
should really zero the bytes in the page cache in partial pages on
file extension (and likely should do it on-disk as well). See
iomap_truncate_page(), ext4_block_truncate_page(), etc.

Leaving the zeroing until someone actually accesses the data leads
to complexity in the IO path to handle this corner case, and getting
that wrong leads directly to data corruption bugs. Just zero the
data in the operation that exposes that data range as zeros to the
user.

-Dave.
2024-01-25 6:35 GMT+09:00, Dave Chinner <david@fromorbit.com>:
> On Wed, Jan 24, 2024 at 10:05:15AM +0000, Yuezhang.Mo@sony.com wrote:
>> From: Matthew Wilcox <willy@infradead.org>
>> Sent: Wednesday, January 24, 2024 1:21 PM
>> To: Mo, Yuezhang <Yuezhang.Mo@sony.com>
>> Subject: Re: [PATCH] exfat: fix file not locking when writing zeros in
>> exfat_file_mmap()
>> > On Wed, Jan 24, 2024 at 05:00:37AM +0000, Yuezhang.Mo@sony.com
>> > wrote:
>> > > inode->i_rwsem should be locked when writing file. But the lock
>> > > is missing when writing zeros to the file in exfat_file_mmap().
>> >
>> > This is actually very weird behaviour in exfat. This kind of "I must
>> > manipulate the on-disc layout" is not generally done in mmap(), it's
>> > done in ->page_mkwrite() or even delayed until we actually do
>> > writeback.
>> > Why does exfat do this?
>>
>> In exfat, "valid_size" describes how far into the data stream user data
>> has been written and "size" describes the file size. Return zeros if
>> read "valid_size"~"size".
>>
>> For example,
>>
>> (1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
>> - Write 0x59 to 0~1023
>> - both "size" and "valid_size" are 1024
>> (2) xfs_io -t -f -c "truncate 4K" $filename
>> - "valid_size" is still 1024
>> - "size" is changed to 4096
>> - 1024~4095 is not zeroed
>
> I think that's the problem right there. File extension via truncate
> should really zero the bytes in the page cache in partial pages on
> file extension (and likely should do it on-disk as well). See
> iomap_truncate_page(), ext4_block_truncate_page(), etc.
>
> Leaving the zeroing until someone actually accesses the data leads
> to complexity in the IO path to handle this corner case and getting
> that wrong leads directly to data corruption bugs. Just zero the
> data in the operation that exposes that data range as zeros to the
> user.

We need to consider the case of mmap against files created from
Windows with different valid size and size. So it needed to zero out
in mmap.

We tried to improve this after receiving a report of a compatibility
issue with linux-exfat, where the two file sizes are set differently
from Windows.

https://github.com/exfatprogs/exfatprogs/issues/213

Yue referred to the mmap code of ntfs3, which has a valid size like
exfat and had handled it in mmap.

Thanks.

> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
On Thu, Jan 25, 2024 at 07:19:45PM +0900, Namjae Jeon wrote:
> 2024-01-25 6:35 GMT+09:00, Dave Chinner <david@fromorbit.com>:
> > On Wed, Jan 24, 2024 at 10:05:15AM +0000, Yuezhang.Mo@sony.com wrote:
> >> In exfat, "valid_size" describes how far into the data stream user data
> >> has been written and "size" describes the file size. Return zeros if
> >> read "valid_size"~"size".
> >>
> >> For example,
> >>
> >> (1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
> >> - Write 0x59 to 0~1023
> >> - both "size" and "valid_size" are 1024
> >> (2) xfs_io -t -f -c "truncate 4K" $filename
> >> - "valid_size" is still 1024
> >> - "size" is changed to 4096
> >> - 1024~4095 is not zeroed
> >
> > I think that's the problem right there. File extension via truncate
> > should really zero the bytes in the page cache in partial pages on
> > file extension (and likely should do it on-disk as well). See
> > iomap_truncate_page(), ext4_block_truncate_page(), etc.
> >
> > Leaving the zeroing until someone actually accesses the data leads
> > to complexity in the IO path to handle this corner case and getting
> > that wrong leads directly to data corruption bugs. Just zero the
> > data in the operation that exposes that data range as zeros to the
> > user.
> We need to consider the case that mmap against files with different
> valid size and size created from Windows. So it needed to zero out in mmap.

That's a different case - that's a "read from a hole" case, not an
"extending truncate" case. i.e. the range from 'valid size' to EOF
is a range where no data has been written and so contains zeros.
It is equivalent to either a hole in the file (no backing store) or
an unwritten range (backing store instantiated but marked as
containing no valid data).

When we consider this range as "reading from a hole/unwritten
range", it should become obvious the correct way to handle this case
is the same as every other filesystem that supports holes and/or
unwritten extents: the page cache page gets zeroed in the
readahead/readpage paths when it maps to a hole/unwritten range in
the file.

There's no special locking needed if it is done this way, and
there's no need for special hooks anywhere to zero data beyond valid
size because it is already guaranteed to be zeroed in memory if the
range is cached in the page cache.....

> We tried to improve this after receiving a report of a compatibility
> issue with linux-exfat, where the two file sizes are set differently
> from Windows.
>
> https://github.com/exfatprogs/exfatprogs/issues/213
>
> Yue referred to mmap code of ntfs3 that has valid-size like exfat and
> had handled it in mmap.

Saying "but someone else did the same thing" doesn't make it the
right thing to do. It just means someone else has already done it the
wrong way, and it wasn't noticed during review. :/

-Dave.
On Fri, Jan 26, 2024 at 12:22:32PM +1100, Dave Chinner wrote:
> On Thu, Jan 25, 2024 at 07:19:45PM +0900, Namjae Jeon wrote:
> > We need to consider the case that mmap against files with different
> > valid size and size created from Windows. So it needed to zero out in mmap.
>
> That's a different case - that's a "read from a hole" case, not an
> "extending truncate" case. i.e. the range from 'valid size' to EOF
> is a range where no data has been written and so contains zeros.
> It is equivalent to either a hole in the file (no backing store) or
> an unwritten range (backing store instantiated but marked as
> containing no valid data).
>
> When we consider this range as "reading from a hole/unwritten
> range", it should become obvious the correct way to handle this case
> is the same as every other filesystem that supports holes and/or
> unwritten extents: the page cache page gets zeroed in the
> readahead/readpage paths when it maps to a hole/unwritten range in
> the file.
>
> There's no special locking needed if it is done this way, and
> there's no need for special hooks anywhere to zero data beyond valid
> size because it is already guaranteed to be zeroed in memory if the
> range is cached in the page cache.....

but the problem is that Microsoft half-arsed their support for holes.
See my other mail in this thread.

	truncate the file up to 4TB
	write a byte at offset 3TB

... now we have to stream 3TB of zeroes through the page cache so that
we can write the byte at 3TB.
> From: Matthew Wilcox <willy@infradead.org>
> Sent: Wednesday, January 24, 2024 10:03 PM
> To: Mo, Yuezhang <Yuezhang.Mo@sony.com>
> Cc: linkinjeon@kernel.org; sj1557.seo@samsung.com; linux-fsdevel@vger.kernel.org
> Subject: Re: [PATCH] exfat: fix file not locking when writing zeros in exfat_file_mmap()
>
> > (3) xfs_io -t -f -c "mmap -rw 0 3072" -c "mwrite -S 0x5a 2048 512" $filename
> > (3.1) "mmap -rw 0 3072"
> > - write zeros to 1024~3071
> > - "valid_size" is changed to 3072
> > - "size" is still 4096
>
> That's not what the code says you do. Is this from a trace or were you
> making up an example?
>
> 	loff_t start = ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
> 	loff_t end = min_t(loff_t, i_size_read(inode),
> 			   start + vma->vm_end - vma->vm_start);
> 	ret = exfat_file_zeroed_range(file, ei->valid_size, end);
>
> vm_end - vm_start will be 4kB because Linux rounds to PAGE_SIZE even if
> you ask to map 3072 bytes.

I just gave an example to explain why there is an operation of writing
zeros in exfat_file_mmap(). Yes, it would have been better to explain
it with traced data; the data in my example confused you.

> Update ->valid_size in the writeback path. If I'm reading
> exfat_get_block() correctly, you already correctly zero pages in the page
> cache that are read after ->valid_size, so there is very little work for
> you to do.

Do you mean moving the valid_size update to ->page_mkwrite()? If yes,
we need to consider this case.

(1) The file size is 12KB (3 pages), valid_size is 0.
(2) mmap maps these 3 pages.
(3) Only the second page is written.
(4) The following needs to be done in ->page_mkwrite():
(4.1) Mark the second page as dirty (currently implemented in
      filemap_page_mkwrite()).
(4.2) Mark the first page as dirty (extra work in exfat).
(4.3) Update valid_size to the end of the second page (extra work in
      exfat).

Is my understanding correct? If yes, how to do (4.2)? Are there
examples of this?

> Oh! I just figured out why you probably do it this way.
>
> (1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename
> (2) xfs_io -t -f -c "truncate 4T" $filename
> (3) xfs_io -t -f -c "mmap -rw 3T 4096" -c "mwrite -S 0x5a 3T 512" $filename
>
> Now (at whatever point you're delaying writing zeroes to) you have to
> write 3TB of zeroes.

From this example, it seems that the current implementation is not
good. I referred to the mmap code of ntfs3, and ntfs3 also writes
zeros in ->mmap. Is the current implementation acceptable as a
workaround?

> And it's probably better to do that at mmap time
> than at page fault time, so you can still return an error. It's a bit
> weird to return ENOSPC from mmap, but here we are.

exfat_file_zeroed_range() never returns ENOSPC, because it does not
change the file size (i.e. it does not allocate new clusters).

> It'd be nice to have a comment to explain this. Also, it seems to me
> that I can write a script that floods the kernel log with:
>
> 	exfat_err(inode->i_sb,
> 		  "mmap: fail to zero from %llu to %llu(%d)",
> 		  start, end, ret);
>
> That error message should probably be taken out entirely (maybe use a
> tracepoint if you really want some kind of logging).

The error message can be seen directly by developers during
development. How about using pr_err_ratelimited()?
On Fri, Jan 26, 2024 at 02:54:24AM +0000, Matthew Wilcox wrote:
> On Fri, Jan 26, 2024 at 12:22:32PM +1100, Dave Chinner wrote:
> > On Thu, Jan 25, 2024 at 07:19:45PM +0900, Namjae Jeon wrote:
> > > We need to consider the case that mmap against files with different
> > > valid size and size created from Windows. So it needed to zero out in mmap.
> >
> > That's a different case - that's a "read from a hole" case, not an
> > "extending truncate" case. i.e. the range from 'valid size' to EOF
> > is a range where no data has been written and so contains zeros.
> > It is equivalent to either a hole in the file (no backing store) or
> > an unwritten range (backing store instantiated but marked as
> > containing no valid data).
> >
> > When we consider this range as "reading from a hole/unwritten
> > range", it should become obvious the correct way to handle this case
> > is the same as every other filesystem that supports holes and/or
> > unwritten extents: the page cache page gets zeroed in the
> > readahead/readpage paths when it maps to a hole/unwritten range in
> > the file.
> >
> > There's no special locking needed if it is done this way, and
> > there's no need for special hooks anywhere to zero data beyond valid
> > size because it is already guaranteed to be zeroed in memory if the
> > range is cached in the page cache.....
>
> but the problem is that Microsoft half-arsed their support for holes.
> See my other mail in this thread.

Why does that matter? It's exactly the same problem with any other
filesystem that doesn't support sparse files.

All I said is that IO operations beyond the "valid size" should
be treated like operating in a hole - I pass no judgement on the
filesystem design, implementation or level of sparse file support
it has. All it needs to do is treat the "not valid" size range as if
it was a hole or unwritten, regardless of whether the file is sparse
or not....

> truncate the file up to 4TB
> write a byte at offset 3TB
>
> ... now we have to stream 3TB of zeroes through the page cache so that
> we can write the byte at 3TB.

This behaviour cannot be avoided on filesystems without sparse file
support - the hit of writing zeroes has to be taken somewhere. We
can handle this in truncate(), the write() path or in ->page_mkwrite
*if* the zeroing condition is hit. There's no need to do it at
mmap() time if that range of the file is not actually written to by
the application...

-Dave.
On Sat, Jan 27, 2024 at 09:32:42AM +1100, Dave Chinner wrote:
> > but the problem is that Microsoft half-arsed their support for holes.
> > See my other mail in this thread.
>
> Why does that matter? It's exactly the same problem with any other
> filesystem that doesn't support sparse files.
>
> All I said is that IO operations beyond the "valid size" should
> be treated like operating in a hole - I pass no judgement on the
> filesystem design, implementation or level of sparse file support
> it has. All it needs to do is treat the "not valid" size range as if
> it was a hole or unwritten, regardless of whether the file is sparse
> or not....
>
> > truncate the file up to 4TB
> > write a byte at offset 3TB
> >
> > ... now we have to stream 3TB of zeroes through the page cache so that
> > we can write the byte at 3TB.
>
> This behaviour cannot be avoided on filesystems without sparse file
> support - the hit of writing zeroes has to be taken somewhere. We
> can handle this in truncate(), the write() path or in ->page_mkwrite
> *if* the zeroing condition is hit. There's no need to do it at
> mmap() time if that range of the file is not actually written to by
> the application...

It's really hard to return -ENOSPC from ->page_mkwrite. At best you'll
get a SIGBUS or SEGV. So truncate() or mmap() are the only two places
to do it. And if we do it in truncate() then we can't take any
advantage of the limited "hole" support the filesystem has.

Most files are never mmaped, much less mapped writable. I think doing
it in mmap() is the best of several bad options.
[Sorry, missed this when you sent it and I think it deserves a
response...]

On Fri, Jan 26, 2024 at 10:41:14PM +0000, Matthew Wilcox wrote:
> On Sat, Jan 27, 2024 at 09:32:42AM +1100, Dave Chinner wrote:
> > > but the problem is that Microsoft half-arsed their support for holes.
> > > See my other mail in this thread.
> >
> > Why does that matter? It's exactly the same problem with any other
> > filesystem that doesn't support sparse files.
> >
> > All I said is that IO operations beyond the "valid size" should
> > be treated like operating in a hole - I pass no judgement on the
> > filesystem design, implementation or level of sparse file support
> > it has. All it needs to do is treat the "not valid" size range as if
> > it was a hole or unwritten, regardless of whether the file is sparse
> > or not....
> >
> > > truncate the file up to 4TB
> > > write a byte at offset 3TB
> > >
> > > ... now we have to stream 3TB of zeroes through the page cache so that
> > > we can write the byte at 3TB.
> >
> > This behaviour cannot be avoided on filesystems without sparse file
> > support - the hit of writing zeroes has to be taken somewhere. We
> > can handle this in truncate(), the write() path or in ->page_mkwrite
> > *if* the zeroing condition is hit. There's no need to do it at
> > mmap() time if that range of the file is not actually written to by
> > the application...
>
> It's really hard to return -ENOSPC from ->page_mkwrite. At best you'll
> get a SIGBUS or SEGV. So truncate() or mmap() are the only two places to
> do it. And if we do it in truncate() then we can't take any advantage
> of the limited "hole" support the filesystem has.

Yes, but the entire point of ->page_mkwrite was to allow filesystems
to abort the user data modification if they were at ENOSPC when the
page fault happened. This is way better than getting into trouble due
to space overcommit caused by ignoring ENOSPC situations during page
faults.

This was a significant problem for XFS users back in the mid-2000s
before ->page_mkwrite existed, because both delayed allocation and
writes over unwritten extents required ENOSPC to be determined at page
fault time. It was too late if ENOSPC happened at writeback time or
IO completion - this was a significant data loss vector, and if we
tried to prevent data loss by redirtying the pages then we'd lock the
machine up because dirty pages could not be cleaned and memory reclaim
couldn't make progress...

IOWs, it is far better to immediately terminate the application than
it is to have silent data loss or completely lock the machine up.
Hence the defined semantics of ->page_mkwrite is to send SIGBUS to the
application on ENOSPC. It's not an amazing solution but, in reality,
it is little different to the OOM killer triggering when memory
reclaim declares ENOMEM. We can't allow the operation to continue,
and we can't return an error for the application to handle. So we
kill it, just like the OOM killer does in the same situation for
ENOMEM.

We even have an fstest that explicitly exercises this case. i.e.
generic/619 creates this mmap() into sparse files situation and then
checks that the filesystem really is at ENOSPC when the page fault
throws a SIGBUS out because ->page_mkwrite failed due to ENOSPC....

> Most files are never mmaped, much less mapped writable. I think doing
> it in mmap() is the best of several bad options.

Perhaps so, but that makes it unnecessarily different to the major
Linux filesystems....

-Dave.
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index d25a96a148af..473c1641d50d 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -613,7 +613,11 @@ static int exfat_file_mmap(struct file *file, struct vm_area_struct *vma)
 			  start + vma->vm_end - vma->vm_start);
 
 	if ((vma->vm_flags & VM_WRITE) && ei->valid_size < end) {
+		if (!inode_trylock(inode))
+			return -EAGAIN;
+
 		ret = exfat_file_zeroed_range(file, ei->valid_size, end);
+		inode_unlock(inode);
 		if (ret < 0) {
 			exfat_err(inode->i_sb,
 				  "mmap: fail to zero from %llu to %llu(%d)",
inode->i_rwsem should be locked when writing file. But the lock
is missing when writing zeros to the file in exfat_file_mmap().

Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
---
 fs/exfat/file.c | 4 ++++
 1 file changed, 4 insertions(+)