[0/6] Add generated flag to filesystem struct to block copy_file_range

Message ID 20210212044405.4120619-1-drinkcat@chromium.org (mailing list archive)

Message

Nicolas Boichat Feb. 12, 2021, 4:43 a.m. UTC
We hit an issue when upgrading the Go compiler from 1.13 to 1.15
[1], as we use Go's `io.Copy` to copy the content of
`/sys/kernel/debug/tracing/trace` to a temporary file.

Under the hood, Go 1.15 uses the `copy_file_range` syscall to
optimize the copy operation. However, that fails to copy any
content when the input file is from tracefs, which reports an
apparent size of 0 (but there is still content when you `cat` it,
of course).

From discussions in [2][3], it is clear that copy_file_range
cannot be properly implemented on filesystems where the content
is generated at runtime: the file size is incorrect (because it
is unknown before the content is generated), and seeking in such
files (as required by partial writes) is unlikely to work
correctly.

With this patch, Go's `io.Copy` gracefully falls back to a normal
read/write file copy.
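
For illustration only, here is a minimal userspace sketch in C of
the kind of fallback described above. It is not Go's actual
implementation and not part of this series. The helper names
`copy_fd` and `copy_read_write` and the 1 MiB chunk size are
made up for the example. It also assumes that a zero-byte result
before any progress should be treated as suspect, which is a
choice of this sketch rather than something the patches require.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for copy_file_range() in glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain read/write loop: relies neither on st_size nor on seeking. */
static ssize_t copy_read_write(int in_fd, int out_fd)
{
    char buf[65536];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof(buf));
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return total;       /* genuine end of file */

        /* write() may be partial, so loop until the chunk is out */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}

/*
 * Hypothetical helper: try copy_file_range() first, then fall back to
 * read/write if it fails, or if it reports 0 bytes before any progress
 * (the symptom seen on tracefs files with an apparent size of 0).
 * Returns the number of bytes copied, or -1 with errno set.
 */
ssize_t copy_fd(int in_fd, int out_fd)
{
    ssize_t total = 0;

    for (;;) {
        ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL, 1 << 20, 0);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n == 0 && total > 0)
            return total;       /* EOF after real progress */

        /* NULL offsets mean the file positions are already advanced,
         * so the fallback simply continues from where we stopped. */
        ssize_t rest = copy_read_write(in_fd, out_fd);
        return rest < 0 ? -1 : total + rest;
    }
}
```

A caller would pass two open descriptors, for example the trace
file and the temporary file, and check for a negative return.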

I'm not 100% sure which stable trees this should go in; I'd say
5.3 and later, since that release introduced support for
cross-filesystem copy_file_range (and is where most users are
somewhat likely to hit this issue). But let's discuss the patch
series first.

[1] http://issuetracker.google.com/issues/178332739
[2] https://lkml.org/lkml/2021/1/25/64
[3] https://lkml.org/lkml/2021/1/26/1736


Nicolas Boichat (6):
  fs: Add flag to file_system_type to indicate content is generated
  proc: Add FS_GENERATED_CONTENT to filesystem flags
  sysfs: Add FS_GENERATED_CONTENT to filesystem flags
  debugfs: Add FS_GENERATED_CONTENT to filesystem flags
  tracefs: Add FS_GENERATED_CONTENT to filesystem flags
  vfs: Disallow copy_file_range on generated file systems

 fs/debugfs/inode.c | 1 +
 fs/proc/root.c     | 2 +-
 fs/read_write.c    | 3 +++
 fs/sysfs/mount.c   | 2 +-
 fs/tracefs/inode.c | 1 +
 include/linux/fs.h | 1 +
 6 files changed, 8 insertions(+), 2 deletions(-)

Comments

Al Viro Feb. 14, 2021, 11:09 p.m. UTC | #1
On Fri, Feb 12, 2021 at 12:43:59PM +0800, Nicolas Boichat wrote:
> We hit an issue when upgrading the Go compiler from 1.13 to 1.15
> [1], as we use Go's `io.Copy` to copy the content of
> `/sys/kernel/debug/tracing/trace` to a temporary file.
> 
> Under the hood, Go 1.15 uses the `copy_file_range` syscall to
> optimize the copy operation. However, that fails to copy any
> content when the input file is from tracefs, which reports an
> apparent size of 0 (but there is still content when you `cat` it,
> of course).
> 
> From discussions in [2][3], it is clear that copy_file_range
> cannot be properly implemented on filesystems where the content
> is generated at runtime: the file size is incorrect (because it
> is unknown before the content is generated), and seeking in such
> files (as required by partial writes) is unlikely to work
> correctly.
> 
> With this patch, Go's `io.Copy` gracefully falls back to a normal
> read/write file copy.
> 
> I'm not 100% sure which stable trees this should go in; I'd say
> 5.3 and later, since that release introduced support for
> cross-filesystem copy_file_range (and is where most users are
> somewhat likely to hit this issue). But let's discuss the patch
> series first.

No.  This is *NOT* an fs-wide flag.  The decision regarding the
usability of copy_file_range() is made on a per-file basis.

The real constraint is "can freely seek back and expect to
find consistent data".  That is what's violated for seq_file.
And frankly, I would rather add a flag and have seq_open()
(and other suckers, if any) clear it.  With check being
"has both FMODE_PREAD and this new flag".
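
As a rough illustration of the check suggested above, here is a
toy userspace model in C. It is not kernel code. The flag name
`FMODE_STABLE_SEEK`, the `toy_*` helpers, the struct, and the
numeric values are all made up for the example; only the name
`FMODE_PREAD` comes from the message. The point it shows is that
"has both" means both bits must be set, not either one.

```c
#include <stdbool.h>

/* Toy model only: values are illustrative, not taken from the kernel. */
#define FMODE_PREAD        0x8u
#define FMODE_STABLE_SEEK  0x100u   /* hypothetical name for the new flag */

struct toy_file {
    unsigned int f_mode;
};

/* Model of an ordinary open: pread works and seeking back is consistent. */
void toy_regular_open(struct toy_file *f)
{
    f->f_mode = FMODE_PREAD | FMODE_STABLE_SEEK;
}

/* Model of a seq_open()-style open that clears the new flag per file. */
void toy_seq_open(struct toy_file *f)
{
    toy_regular_open(f);
    f->f_mode &= ~FMODE_STABLE_SEEK;
}

/* The check: copy_file_range is usable only if BOTH bits are set. */
bool toy_can_copy_file_range(const struct toy_file *f)
{
    const unsigned int need = FMODE_PREAD | FMODE_STABLE_SEEK;

    return (f->f_mode & need) == need;
}
```

In this model the decision follows each open file rather than
the filesystem it lives on, so two files on the same filesystem
can get different answers.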