[RERESEND,v9,0/9] fs: interface for directly reading/writing compressed data

Message ID	cover.1621276134.git.osandov@fb.com (mailing list archive)
Headers	show Return-Path: <linux-fsdevel-owner@kernel.org> From: Omar Sandoval <osandov@osandov.com> To: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>, Christoph Hellwig <hch@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>, Dave Chinner <david@fromorbit.com>, Jann Horn <jannh@google.com>, Amir Goldstein <amir73il@gmail.com>, Aleksa Sarai <cyphar@cyphar.com>, linux-api@vger.kernel.org, kernel-team@fb.com Subject: [PATCH RERESEND v9 0/9] fs: interface for directly reading/writing compressed data Date: Mon, 17 May 2021 11:35:18 -0700 Message-Id: <cover.1621276134.git.osandov@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	fs: interface for directly reading/writing compressed data \| expand [RERESEND,v9,0/9] fs: interface for directly reading/writing compressed data [RERESEND,v9,1/9] iov_iter: add copy_struct_from_iter() [RERESEND,v9,2/9] fs: add O_ALLOW_ENCODED open flag [RERESEND,v9,3/9] fs: add RWF_ENCODED for reading/writing compressed data [RERESEND,v9,4/9] btrfs: don't advance offset for compressed bios in btrfs_csum_one_bio() [RERESEND,v9,5/9] btrfs: add ram_bytes and offset to btrfs_ordered_extent [RERESEND,v9,6/9] btrfs: support different disk extent size for delalloc [RERESEND,v9,7/9] btrfs: optionally extend i_size in cow_file_range_inline() [RERESEND,v9,8/9] btrfs: implement RWF_ENCODED reads [RERESEND,v9,9/9] btrfs: implement RWF_ENCODED writes

Omar Sandoval May 17, 2021, 6:35 p.m. UTC

From: Omar Sandoval <osandov@fb.com>

This series adds an API for reading compressed data on a filesystem
without decompressing it as well as support for writing compressed data
directly to the filesystem. I have test cases (including fsstress
support) and example programs which I'll send up once the dust settles
[1].

The main use-case is Btrfs send/receive: currently, when sending data
from one compressed filesystem to another, the sending side decompresses
the data and the receiving side recompresses it before writing it out.
This is wasteful and can be avoided if we can just send and write
compressed extents. The patches implementing the send/receive support
were sent with the last submission of this series [2].

Patches 1-3 add the VFS support, UAPI, and documentation. Patches 4-7
are Btrfs prep patches. Patch 8 adds Btrfs encoded read support and
patch 9 adds Btrfs encoded write support.

These patches are based on Dave Sterba's Btrfs misc-next branch [3],
which is in turn currently based on v5.13-rc2.

This is a _resend of a resend_ of v9 [4], rebased on the latest
kdave/misc-next branch.

1: https://github.com/osandov/xfstests/tree/rwf-encoded
2: https://lore.kernel.org/linux-btrfs/cover.1615922753.git.osandov@fb.com/
3: https://github.com/kdave/btrfs-devel/tree/misc-next
4: https://lore.kernel.org/linux-fsdevel/cover.1619463858.git.osandov@fb.com/

Omar Sandoval (9):
  iov_iter: add copy_struct_from_iter()
  fs: add O_ALLOW_ENCODED open flag
  fs: add RWF_ENCODED for reading/writing compressed data
  btrfs: don't advance offset for compressed bios in
    btrfs_csum_one_bio()
  btrfs: add ram_bytes and offset to btrfs_ordered_extent
  btrfs: support different disk extent size for delalloc
  btrfs: optionally extend i_size in cow_file_range_inline()
  btrfs: implement RWF_ENCODED reads
  btrfs: implement RWF_ENCODED writes

 Documentation/filesystems/encoded_io.rst | 240 ++++++
 Documentation/filesystems/index.rst      |   1 +
 arch/alpha/include/uapi/asm/fcntl.h      |   1 +
 arch/parisc/include/uapi/asm/fcntl.h     |   1 +
 arch/sparc/include/uapi/asm/fcntl.h      |   1 +
 fs/btrfs/compression.c                   |  12 +-
 fs/btrfs/compression.h                   |   6 +-
 fs/btrfs/ctree.h                         |   9 +-
 fs/btrfs/delalloc-space.c                |  18 +-
 fs/btrfs/file-item.c                     |  35 +-
 fs/btrfs/file.c                          |  46 +-
 fs/btrfs/inode.c                         | 929 +++++++++++++++++++++--
 fs/btrfs/ordered-data.c                  | 124 +--
 fs/btrfs/ordered-data.h                  |  25 +-
 fs/btrfs/relocation.c                    |   4 +-
 fs/fcntl.c                               |  10 +-
 fs/namei.c                               |   4 +
 fs/read_write.c                          | 168 +++-
 include/linux/encoded_io.h               |  17 +
 include/linux/fcntl.h                    |   2 +-
 include/linux/fs.h                       |  13 +
 include/linux/uio.h                      |   1 +
 include/uapi/asm-generic/fcntl.h         |   4 +
 include/uapi/linux/encoded_io.h          |  30 +
 include/uapi/linux/fs.h                  |   5 +-
 lib/iov_iter.c                           |  91 +++
 26 files changed, 1563 insertions(+), 234 deletions(-)
 create mode 100644 Documentation/filesystems/encoded_io.rst
 create mode 100644 include/linux/encoded_io.h
 create mode 100644 include/uapi/linux/encoded_io.h

Linus Torvalds May 17, 2021, 9:32 p.m. UTC | #1

On Mon, May 17, 2021 at 11:35 AM Omar Sandoval <osandov@osandov.com> wrote:
>
> Patches 1-3 add the VFS support, UAPI, and documentation. Patches 4-7
> are Btrfs prep patches. Patch 8 adds Btrfs encoded read support and
> patch 9 adds Btrfs encoded write support.

I don't love the RWF_ENCODED flag, but if that's the way people think
this should be done, as a model this looks reasonable to me.

I'm not sure what the deal with the encryption metadata is. I realize
there is currently only one encryption type ("none") in this series,
but it's not clear how any other encryption type would actually ever
be described. It's not like you can pass in the key (well, I guess
passing in the key would be fine, but passing it back out certainly
would not be).  A key ID from a keyring?

So there's presumably some future plan for it, but it would be good to
verify that that plan makes sense..

                            Linus

Omar Sandoval May 17, 2021, 10:27 p.m. UTC | #2

On Mon, May 17, 2021 at 02:32:47PM -0700, Linus Torvalds wrote:
> On Mon, May 17, 2021 at 11:35 AM Omar Sandoval <osandov@osandov.com> wrote:
> >
> > Patches 1-3 add the VFS support, UAPI, and documentation. Patches 4-7
> > are Btrfs prep patches. Patch 8 adds Btrfs encoded read support and
> > patch 9 adds Btrfs encoded write support.
> 
> I don't love the RWF_ENCODED flag, but if that's the way people think
> this should be done, as a model this looks reasonable to me.
> 
> I'm not sure what the deal with the encryption metadata is. I realize
> there is currently only one encryption type ("none") in this series,
> but it's not clear how any other encryption type would actually ever
> be described. It's not like you can pass in the key (well, I guess
> passing in the key would be fine, but passing it back out certainly
> would not be).  A key ID from a keyring?
> 
> So there's presumably some future plan for it, but it would be good to
> verify that that plan makes sense..

What I'm imagining for fscrypt is:

1. Add ENCODED_IOV_ENCRYPTION_* types for fscrypt. Consumers at least
   need to be able to distinguish between encryption policy versions,
   DIRECT_KEY policies, and IV_INO_LBLK_{64,32} policies, and maybe
   other details.
2. Use RWF_ENCODED only for the data itself.
3. Add new fscrypt ioctls to get and set the encryption key.

The interesting part is (3). If I'm reading the fscrypt documentation
correctly, in the default mode, each file is encrypted with a per-file
key that is a function of the master key for the directory tree and a
per-file nonce.

Userspace manages the master key, we have a FS_IOC_GET_ENCRYPTION_NONCE
ioctl, and the key derivation function is documented. So, userspace
already has all of the pieces it needs to get the encryption key, and
all of the information it needs to decrypt the data it gets from
RWF_ENCODED if it so desires.

On the set/write side, the user can set the same master key and policy
with FS_IOC_SET_ENCRYPTION_POLICY, and we'd need something like an
FS_IOC_SET_ENCRYPTION_NONCE ioctl (possibly with a requirement that it
be set when the file is empty). I think that's it.

The details will vary for the other fscrypt policies, but that's the
gist of it. I added the fscrypt maintainers to correct me if I missed
something.

Thanks,
Omar

Eric Biggers May 17, 2021, 10:48 p.m. UTC | #3

On Mon, May 17, 2021 at 03:27:48PM -0700, Omar Sandoval wrote:
> On Mon, May 17, 2021 at 02:32:47PM -0700, Linus Torvalds wrote:
> > On Mon, May 17, 2021 at 11:35 AM Omar Sandoval <osandov@osandov.com> wrote:
> > >
> > > Patches 1-3 add the VFS support, UAPI, and documentation. Patches 4-7
> > > are Btrfs prep patches. Patch 8 adds Btrfs encoded read support and
> > > patch 9 adds Btrfs encoded write support.
> > 
> > I don't love the RWF_ENCODED flag, but if that's the way people think
> > this should be done, as a model this looks reasonable to me.
> > 
> > I'm not sure what the deal with the encryption metadata is. I realize
> > there is currently only one encryption type ("none") in this series,
> > but it's not clear how any other encryption type would actually ever
> > be described. It's not like you can pass in the key (well, I guess
> > passing in the key would be fine, but passing it back out certainly
> > would not be).  A key ID from a keyring?
> > 
> > So there's presumably some future plan for it, but it would be good to
> > verify that that plan makes sense..
> 
> What I'm imagining for fscrypt is:
> 
> 1. Add ENCODED_IOV_ENCRYPTION_* types for fscrypt. Consumers at least
>    need to be able to distinguish between encryption policy versions,
>    DIRECT_KEY policies, and IV_INO_LBLK_{64,32} policies, and maybe
>    other details.
> 2. Use RWF_ENCODED only for the data itself.
> 3. Add new fscrypt ioctls to get and set the encryption key.
> 
> The interesting part is (3). If I'm reading the fscrypt documentation
> correctly, in the default mode, each file is encrypted with a per-file
> key that is a function of the master key for the directory tree and a
> per-file nonce.
> 
> Userspace manages the master key, we have a FS_IOC_GET_ENCRYPTION_NONCE
> ioctl, and the key derivation function is documented. So, userspace
> already has all of the pieces it needs to get the encryption key, and
> all of the information it needs to decrypt the data it gets from
> RWF_ENCODED if it so desires.
> 
> On the set/write side, the user can set the same master key and policy
> with FS_IOC_SET_ENCRYPTION_POLICY, and we'd need something like an
> FS_IOC_SET_ENCRYPTION_NONCE ioctl (possibly with a requirement that it
> be set when the file is empty). I think that's it.
> 
> The details will vary for the other fscrypt policies, but that's the
> gist of it. I added the fscrypt maintainers to correct me if I missed
> something.
> 

Well, assuming we're talking about regular files only (so file contents
encryption, not filenames encryption), with fscrypt the information needed to
understand a file's encrypted data is the following:

1. The encryption key

2. The filesystem's block size

3. The encryption context:

    struct fscrypt_context_v2 {                                                      
         u8 version; /* FSCRYPT_CONTEXT_V2 */                                     
         u8 contents_encryption_mode;                                             
         u8 filenames_encryption_mode;                                            
         u8 flags;                                                                
         u8 __reserved[4];                                                        
         u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];                   
         u8 nonce[FSCRYPT_FILE_NONCE_SIZE];                                       
    };                                                                               

   (Or alternatively struct fscrypt_policy_v2 + the nonce field separately;
    that results in the same fields as struct fscrypt_context_v2.)

This is definitely more complex than the compression cases like "the data is a
zlib stream".  So the question is, how much of this metadata (if any) should
actually be passed around during RWF_ENCODED pread/pwrite operations, and how
much should be out-of-band.

I feel like this should be mostly out-of-band (e.g. via the existing ioctls
FS_IOC_{GET,SET}_ENCRYPTION_POLICY), especially given that compression and
encryption could be combined which would make describing the on-disk data even
more difficult.

But I'm not sure what you intended.

- Eric

Omar Sandoval May 17, 2021, 11:25 p.m. UTC | #4

On Mon, May 17, 2021 at 03:48:30PM -0700, Eric Biggers wrote:
> On Mon, May 17, 2021 at 03:27:48PM -0700, Omar Sandoval wrote:
> > On Mon, May 17, 2021 at 02:32:47PM -0700, Linus Torvalds wrote:
> > > On Mon, May 17, 2021 at 11:35 AM Omar Sandoval <osandov@osandov.com> wrote:
> > > >
> > > > Patches 1-3 add the VFS support, UAPI, and documentation. Patches 4-7
> > > > are Btrfs prep patches. Patch 8 adds Btrfs encoded read support and
> > > > patch 9 adds Btrfs encoded write support.
> > > 
> > > I don't love the RWF_ENCODED flag, but if that's the way people think
> > > this should be done, as a model this looks reasonable to me.
> > > 
> > > I'm not sure what the deal with the encryption metadata is. I realize
> > > there is currently only one encryption type ("none") in this series,
> > > but it's not clear how any other encryption type would actually ever
> > > be described. It's not like you can pass in the key (well, I guess
> > > passing in the key would be fine, but passing it back out certainly
> > > would not be).  A key ID from a keyring?
> > > 
> > > So there's presumably some future plan for it, but it would be good to
> > > verify that that plan makes sense..
> > 
> > What I'm imagining for fscrypt is:
> > 
> > 1. Add ENCODED_IOV_ENCRYPTION_* types for fscrypt. Consumers at least
> >    need to be able to distinguish between encryption policy versions,
> >    DIRECT_KEY policies, and IV_INO_LBLK_{64,32} policies, and maybe
> >    other details.
> > 2. Use RWF_ENCODED only for the data itself.
> > 3. Add new fscrypt ioctls to get and set the encryption key.
> > 
> > The interesting part is (3). If I'm reading the fscrypt documentation
> > correctly, in the default mode, each file is encrypted with a per-file
> > key that is a function of the master key for the directory tree and a
> > per-file nonce.
> > 
> > Userspace manages the master key, we have a FS_IOC_GET_ENCRYPTION_NONCE
> > ioctl, and the key derivation function is documented. So, userspace
> > already has all of the pieces it needs to get the encryption key, and
> > all of the information it needs to decrypt the data it gets from
> > RWF_ENCODED if it so desires.
> > 
> > On the set/write side, the user can set the same master key and policy
> > with FS_IOC_SET_ENCRYPTION_POLICY, and we'd need something like an
> > FS_IOC_SET_ENCRYPTION_NONCE ioctl (possibly with a requirement that it
> > be set when the file is empty). I think that's it.
> > 
> > The details will vary for the other fscrypt policies, but that's the
> > gist of it. I added the fscrypt maintainers to correct me if I missed
> > something.
> > 
> 
> Well, assuming we're talking about regular files only (so file contents
> encryption, not filenames encryption),

Yes, I was thinking of regular files. File operations using encrypted
names sounds... interesting, but I think out of scope for this.

> with fscrypt the information needed to
> understand a file's encrypted data is the following:
> 
> 1. The encryption key
> 
> 2. The filesystem's block size
> 
> 3. The encryption context:
> 
>     struct fscrypt_context_v2 {                                                      
>          u8 version; /* FSCRYPT_CONTEXT_V2 */                                     
>          u8 contents_encryption_mode;                                             
>          u8 filenames_encryption_mode;                                            
>          u8 flags;                                                                
>          u8 __reserved[4];                                                        
>          u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];                   
>          u8 nonce[FSCRYPT_FILE_NONCE_SIZE];                                       
>     };                                                                               
> 
>    (Or alternatively struct fscrypt_policy_v2 + the nonce field separately;
>     that results in the same fields as struct fscrypt_context_v2.)
> 
> This is definitely more complex than the compression cases like "the data is a
> zlib stream".  So the question is, how much of this metadata (if any) should
> actually be passed around during RWF_ENCODED pread/pwrite operations, and how
> much should be out-of-band.
> 
> I feel like this should be mostly out-of-band (e.g. via the existing ioctls
> FS_IOC_{GET,SET}_ENCRYPTION_POLICY), especially given that compression and
> encryption could be combined which would make describing the on-disk data even
> more difficult.
> 
> But I'm not sure what you intended.

Okay, I think we're in agreement: RWF_ENCODED for the data and separate
ioctls for the encryption context. Since the fscrypt policy struct
includes all of the relevant information, RWF_ENCODED can probably just
have a single ENCODED_IOV_ENCRYPTION_FSCRYPT encryption type.
RWF_ENCODED can express data which is both compressed and encrypted, so
that should be fine as well.

The only other missing piece that I see (other than filesystem support)
is an FS_IOC_SET_ENCRYPTION_NONCE ioctl. Would such an interface be
reasonable?

Eric Biggers May 18, 2021, 12:07 a.m. UTC | #5

On Mon, May 17, 2021 at 04:25:15PM -0700, Omar Sandoval wrote:
> 
> Okay, I think we're in agreement: RWF_ENCODED for the data and separate
> ioctls for the encryption context. Since the fscrypt policy struct
> includes all of the relevant information, RWF_ENCODED can probably just
> have a single ENCODED_IOV_ENCRYPTION_FSCRYPT encryption type.
> RWF_ENCODED can express data which is both compressed and encrypted, so
> that should be fine as well.
> 
> The only other missing piece that I see (other than filesystem support)
> is an FS_IOC_SET_ENCRYPTION_NONCE ioctl. Would such an interface be
> reasonable?

In theory, it will be possible to add FS_IOC_SET_ENCRYPTION_NONCE.  The
implementation might be tricky.  It would have to take the inode lock, verify
that the file is empty, replace the encryption xattr, and re-derive and replace
the file's encryption key.  Replacing the key should be safe because the file is
empty, but it's hard to be sure -- and what about directories?  Another concern
is that userspace could misuse this ioctl and somehow end up reusing nonces,
which would be bad; probably this should be a CAP_SYS_ADMIN thing only.

A larger question is whether the goal is to support users backing up and
restoring encrypted files without their encryption key being available -- in
which case things would become *much* harder.  First because of the filenames
encryption, and second because we currently don't allow opening files without
their encryption key.

- Eric

Theodore Ts'o May 18, 2021, 2:53 a.m. UTC | #6

On Mon, May 17, 2021 at 04:25:15PM -0700, Omar Sandoval wrote:
> > Well, assuming we're talking about regular files only (so file contents
> > encryption, not filenames encryption),
> 
> Yes, I was thinking of regular files. File operations using encrypted
> names sounds... interesting, but I think out of scope for this.

So the question I have is why would you want to get the raw encrypted
data?  One possible reason (and this is what Michael Halwcrow and I
had tried designing years ago) was so you could backup a device that
had multiple users' files without having all of the users' keys ---
and then be able to restore them.  So for example, suppose you had a
tablet that is shared by multiple family members, and you want to be
backup all of the data on the shared device so that it could be
restored in case one of the kids drop the tablet in the swimming pool....

But in order to do that, you need to be able to restore the encrypted
files in the encrypted directories.  In practice, encrypted files
generally exist in encrypted directories.  That's because the typical
way fscrypt gets used is we set a policy on an empty directory, and
then all of the newly files created files have encrypted file names,
inherit the directory's encryption policy, and then have encrypted
file contents.

So do you have the encryption key, or not?  If you do have the
encryption key, then you can ignore the issue of the file name when
you open the file, but what's the reason why you would want to extract
out the raw encrypted data plus the raw encryption metadata?  You're
not going to be able to restore the encrypted file, in the encrypted
directory name.  Perhaps it's because you want to keep the data
encrypted while you're tranferring it --- but the filename needs to be
encrypted as well, and given modern CPU's, with or without
inline-crypto engines, the cost of decrypting the file data and then
re-encrypting it in the backup key isn't really that large.

If you don't have the encryption key, then you need to be able to open
the file using using the encrypted name (which fscrypt does support)
and then extract out the encrypted file name using another bundle of
encryption metadata.  So that's a bit more complicated, but it's
doable.

The *really* hard part is *restoring* an encrypted directory
hierarchy.  Michael and I did create a straw design proposal (which is
too small to fit in the margins of this e-mail :-), but suffice it to
say that the standard Posix system calls are not sufficient to be able
to create encrypted files and encrypted directories, and it would have
been messy as all hell.  Which is why we breathed a sign of relief
when the original product requirement of being able to do
backup/restore of shared devices went away.   :-)

The thing is, though, just being able to extract out regular files in
their raw encrypted on-disk form, along with their filename metadata,
seems to be a bit of a party trick without a compelling use case that
I can see.  But perhaps you have something in mind?

						- Ted

Omar Sandoval May 18, 2021, 8:38 a.m. UTC | #7

On Mon, May 17, 2021 at 10:53:16PM -0400, Theodore Y. Ts'o wrote:
> On Mon, May 17, 2021 at 04:25:15PM -0700, Omar Sandoval wrote:
> > > Well, assuming we're talking about regular files only (so file contents
> > > encryption, not filenames encryption),
> > 
> > Yes, I was thinking of regular files. File operations using encrypted
> > names sounds... interesting, but I think out of scope for this.
> 
> So the question I have is why would you want to get the raw encrypted
> data?  One possible reason (and this is what Michael Halwcrow and I
> had tried designing years ago) was so you could backup a device that
> had multiple users' files without having all of the users' keys ---
> and then be able to restore them.  So for example, suppose you had a
> tablet that is shared by multiple family members, and you want to be
> backup all of the data on the shared device so that it could be
> restored in case one of the kids drop the tablet in the swimming pool....
> 
> But in order to do that, you need to be able to restore the encrypted
> files in the encrypted directories.  In practice, encrypted files
> generally exist in encrypted directories.  That's because the typical
> way fscrypt gets used is we set a policy on an empty directory, and
> then all of the newly files created files have encrypted file names,
> inherit the directory's encryption policy, and then have encrypted
> file contents.
> 
> So do you have the encryption key, or not?  If you do have the
> encryption key, then you can ignore the issue of the file name when
> you open the file, but what's the reason why you would want to extract
> out the raw encrypted data plus the raw encryption metadata?  You're
> not going to be able to restore the encrypted file, in the encrypted
> directory name.  Perhaps it's because you want to keep the data
> encrypted while you're tranferring it --- but the filename needs to be
> encrypted as well, and given modern CPU's, with or without
> inline-crypto engines, the cost of decrypting the file data and then
> re-encrypting it in the backup key isn't really that large.
> 
> If you don't have the encryption key, then you need to be able to open
> the file using using the encrypted name (which fscrypt does support)
> and then extract out the encrypted file name using another bundle of
> encryption metadata.  So that's a bit more complicated, but it's
> doable.
> 
> The *really* hard part is *restoring* an encrypted directory
> hierarchy.  Michael and I did create a straw design proposal (which is
> too small to fit in the margins of this e-mail :-), but suffice it to
> say that the standard Posix system calls are not sufficient to be able
> to create encrypted files and encrypted directories, and it would have
> been messy as all hell.  Which is why we breathed a sign of relief
> when the original product requirement of being able to do
> backup/restore of shared devices went away.   :-)
> 
> The thing is, though, just being able to extract out regular files in
> their raw encrypted on-disk form, along with their filename metadata,
> seems to be a bit of a party trick without a compelling use case that
> I can see.  But perhaps you have something in mind?

Thanks for the detailed response, Ted. I personally don't have a use
case for reading and writing encrypted data. I only care about skipping
compression/decompression, but early on it was pointed out that this API
could potentially also be used for encrypted data. The question at this
point is: if someone else comes along and solves the problems with
restoring encrypted filenames, is this interface enough for them to
restore the encrypted file data? It seems like the answer is yes, with a
couple of additions to fscrypt. I should've been clearer that I don't
have concrete plans to do this, I just wanted to leave the door open for
it so that we don't need a second, very similar interface.

Thanks,
Omar

Theodore Ts'o May 18, 2021, 4:21 p.m. UTC | #8

On Tue, May 18, 2021 at 01:38:08AM -0700, Omar Sandoval wrote:
> Thanks for the detailed response, Ted. I personally don't have a use
> case for reading and writing encrypted data. I only care about skipping
> compression/decompression, but early on it was pointed out that this API
> could potentially also be used for encrypted data. The question at this
> point is: if someone else comes along and solves the problems with
> restoring encrypted filenames, is this interface enough for them to
> restore the encrypted file data? It seems like the answer is yes, with a
> couple of additions to fscrypt. I should've been clearer that I don't
> have concrete plans to do this, I just wanted to leave the door open for
> it so that we don't need a second, very similar interface.

Well, practically speaking, we would need to have a way to extract out
the encrypted file name information; and given that an encrypted file
could have hard links, we need to be able to obtain the encrypted file
name information for the dentry that was used to open that file.  This
arguably should be separate from the encryption information for data
stream itself, so if we want to handwave how we fetch the encrypted
filename info (maybe some magic ioctl, or maybe via using some kind of
magic RWF flag used for reading encrypted directories that are opened
via O_DIRECTORY, which sorta-works like readdir() but also returns
some additional metadata information for each directory entry), sure
it should be possible to use your proposed interface as a starting
point.

I'm not sure we want to try to design all of the details of how to get
the encrypted data plus encryption metadata for the data stream, but
in theory, so long as there is a way to get the encryption metadata,
sure, it could work.  One other approach is to just abuse the xattr
interface, the way we do with Posix ACL's and Capabilities, where the
on-disk format and the format used when we query a file's ACL via the
xattr interface don't necessary have to be identical.  I'm sure that
will result in howls of outrage in some quarters, but this is
something for which we have a precedent.

Cheers,

					- Ted

Omar Sandoval June 7, 2021, 7:27 p.m. UTC | #9

On Mon, May 17, 2021 at 02:32:47PM -0700, Linus Torvalds wrote:
> On Mon, May 17, 2021 at 11:35 AM Omar Sandoval <osandov@osandov.com> wrote:
> >
> > Patches 1-3 add the VFS support, UAPI, and documentation. Patches 4-7
> > are Btrfs prep patches. Patch 8 adds Btrfs encoded read support and
> > patch 9 adds Btrfs encoded write support.
> 
> I don't love the RWF_ENCODED flag, but if that's the way people think
> this should be done, as a model this looks reasonable to me.
> 
> I'm not sure what the deal with the encryption metadata is. I realize
> there is currently only one encryption type ("none") in this series,
> but it's not clear how any other encryption type would actually ever
> be described. It's not like you can pass in the key (well, I guess
> passing in the key would be fine, but passing it back out certainly
> would not be).  A key ID from a keyring?
> 
> So there's presumably some future plan for it, but it would be good to
> verify that that plan makes sense..

To summarize the discussion and answer your original question, using
RWF_ENCODED for encryption will require additional support for getting
encryption metadata, but that support is best suited for a separate
interface, with RWF_ENCODED purely for the encrypted data itself. The
harder part of encrypted backups is restoring filenames, and that would
also be best as a separate interface.

My use case is only for compression, so none of that is a blocker for
RWF_ENCODED.

What else can I do to move this along?

Thanks,
Omar

[RERESEND,v9,0/9] fs: interface for directly reading/writing compressed data

Message

Comments