Message ID | 20170418210642.6039-1-gwendal@chromium.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
+Cc linux-f2fs-devel@lists.sourceforge.net +Cc linux-mtd@lists.infradead.org (for ubifs) Hi Gwendal, On Tue, Apr 18, 2017 at 02:06:42PM -0700, Gwendal Grignou wrote: > If we use only 16 bytes, due to how CBC works, if the names have the > same beginning, their last ciphertext block (16 bytes) may be identical. > > It happens when two file names share the first 16k bytes and both have > length withn 16 * n + 13 and 16 * n + 16. > > Instead use 32 bytes to build the filenames from encrypted data when > directory is scrambled. Just some background for people who may be unfamiliar with what's going on here (and it may be useful to include some of this in the patch description): When accessing files without access to the key, userspace needs to operate on a filename derived from the ciphertext filename, which contains arbitrary bytes. But since it's supported to use filenames up to FILENAME_MAX (255 bytes) in length when using encryption, we can't always base-64 encode the filename, since that may make it too long. The way this is solved currently is that for filenames with ciphertext length greater than 32 bytes, the filesystem provides an 8-byte "cookie" (split into 'hash' and 'minor_hash'), which along with the last 16 bytes of the filename ciphertext is base-64 encoded into a fixed-length name. The filesystem returns this on readdir. Then, when a lookup is done, the filesystem translates this info back into a specific directory entry. Since ext4 directory entries do not contain a hash field, ext4 relies only on the 16 bytes of ciphertext to distinguish collisions within a directory block. Unfortunately, this is broken because with the encryption mode used for filenames (CTS), the ciphertext of the last 16-byte block depends only on the plaintext up to and including the *second to last* block, not up to the last block. This causes long filenames that differ just near the end to collide. We could fix this by using the second to last block of ciphertext rather than the last one. However, using the last *two* blocks as you're proposing should be fine too. Of course we could also hash the filename's ciphertext with SHA-256 or something, but it's nice to take advantage of the encryption mode, and not have to do yet another hash. I am not too worried about changing the way encrypted filenames are presented, since applications are not supposed to rely on this. (Though we probably should be doing something to catch broken applications, like encoding the filenames slightly differently after each reboot...) Strangely, f2fs and ubifs do not use the bytes from the filename at all when trying to find a specific directory entry in this case. So this patch doesn't really affect them. This seems unreliable; perhaps we should introduce a function like "fscrypt_name_matches()" which all the filesystems could call? Can any of the f2fs and ubifs developers explain why they don't look at any bytes from the filename? Anyway, a couple nits on this patch: > + oname->len = 1 + digest_encode( > + buf, > + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), > + oname->name + 1); > return 0; > } Use 'sizeof(buf)' > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c > index c4a389a6027b..14b2a2335a32 100644 > --- a/fs/ext4/namei.c > +++ b/fs/ext4/namei.c > @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname, > int ret; > if (de->name_len < 16) > return 0; de->name_len < 32 (or replace 32 with FS_FNAME_CRYPTO_DIGEST_SIZE, here and below) - Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Apr 18, 2017, at 3:06 PM, Gwendal Grignou <gwendal@chromium.org> wrote: > Subject: [PATCH] fscrypt: use 32 bytes of encrypted filename > If we use only 16 bytes, due to how CBC works, if the names have the > same beginning, their last ciphertext block (16 bytes) may be identical. What is missing from the patch summary is: use the encrypted filename for _what_? > > It happens when two file names share the first 16k bytes and both have "16k bytes" for the file names? Presumably you don't mean "16KB", but rather "16n" or "multiple of 16 bytes" would be more clear. > length withn 16 * n + 13 and 16 * n + 16. > > Instead use 32 bytes to build the filenames from encrypted data when > directory is scrambled. > > The drawback is the scrambled filenames change after applying the patch. > Consider an encrypted directory with: > > ls -il > total 8 > 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10 > system@framework@boot-telephony-common.art.crc > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > system@framework@boot-telephony-common.oat.crc > > Once the key is invalidated, without the patch, ls -li produces: > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _a1Psh01n8FdhC8s9pUywlAyFzlz7n6C3 > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _wJS,0akq14ehC8s9pUywlAyFzlz7n6C3 > > Both files show with the same inode. > > After the patch, the names are different, but the inode information is > now correct: > ls -li > 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _a1Psh01n8FtxbeglW8BqhuthSUxMqh6cFKwz2nSJDXCIXMXOvfqLcD > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _wJS,0akq14eJcuQks7f2Vsg,zE0Jdz98FKwz2nSJDXCIXMXOvfqLcD It isn't clear to me what the difference is here. In the first case (without patch) the last 21 characters of the filename are the same, but the first 11 characters are still different, so it isn't clear why that isn't enough to distinguish the files, and why checking the whole filename isn't enough to properly determine the inode number? In the second case (with patch) the last 22 characters are also the same. > Tested only on ext4. > > Signed-off-by: Gwendal Grignou <gwendal@chromium.org> > --- > Script to reproduce the error: > > BASE_DIR="~/" > DIR="${BASE_DIR}/tmp" > # Create directory. > mkdir -p "${DIR}" > echo foobar | e4crypt add_key "${DIR}" > # Fill directory. > cd "${DIR}" > touch system@framework@boot-telephony-common.oat.crc > touch system@framework@boot-telephony-common.art.crc > cd .. > # Check files have different inode. > ls -il "${DIR}" > # Invalidate key > KEY="$(keyctl show | grep $(e4crypt get_policy "${DIR}" | cut -d ':' -f 2) | sed -ne 's/\(.*\) --al.*/\1/p')" > sync > keyctl invalidate "${KEY}" > echo 3 > /proc/sys/vm/drop_caches > # Once the key is invalidated, both files have the same inode: > ls -il "${DIR}" > if [ $(ls -i1 "${DIR}" | cut -d ' ' -f 1 | uniq | wc -l) -eq 1 ] ; then > echo same inode! > fi > # if we try to remove the directory, we will get an error > # : Structure needs cleaning > # rm -rf "${DIR}" > > fs/crypto/fname.c | 20 +++++++++++++------- > fs/ext4/namei.c | 4 ++-- > 2 files changed, 15 insertions(+), 9 deletions(-) > > diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c > index 80bb956e14e5..71ddc3eaa62d 100644 > --- a/fs/crypto/fname.c > +++ b/fs/crypto/fname.c > @@ -274,7 +274,7 @@ int fscrypt_fname_disk_to_usr(struct inode *inode, > struct fscrypt_str *oname) > { > const struct qstr qname = FSTR_TO_QSTR(iname); > - char buf[24]; > + char buf[FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64)]; > > if (fscrypt_is_dot_dotdot(&qname)) { > oname->name[0] = '.'; > @@ -295,14 +295,19 @@ int fscrypt_fname_disk_to_usr(struct inode *inode, > return 0; > } > if (hash) { > - memcpy(buf, &hash, 4); > - memcpy(buf + 4, &minor_hash, 4); > + memcpy(buf, &hash, sizeof(u32)); > + memcpy(buf + 4, &minor_hash, sizeof(u32)); Should "4" be replaced with something here? sizeof(u32), or something else? > } else { > memset(buf, 0, 8); Likewise, should "8" be replaced with sizeof(u64) as it is with other changes? > } > - memcpy(buf + 8, iname->name + iname->len - 16, 16); > + memcpy(buf + sizeof(u64), > + iname->name + iname->len - FS_FNAME_CRYPTO_DIGEST_SIZE, > + FS_FNAME_CRYPTO_DIGEST_SIZE); > oname->name[0] = '_'; > - oname->len = 1 + digest_encode(buf, 24, oname->name + 1); > + oname->len = 1 + digest_encode( > + buf, > + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), > + oname->name + 1); > return 0; > } > EXPORT_SYMBOL(fscrypt_fname_disk_to_usr); > @@ -375,10 +380,11 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname, > */ > if (iname->name[0] == '_') > bigname = 1; > - if ((bigname && (iname->len != 33)) || (!bigname && (iname->len > 43))) > + if ((bigname && iname->len != 55) || (!bigname && (iname->len > 43))) The "55" constant is no better than "33" in terms of clarity, nor is "43". Having a (computed) constant for this would be more helpful, for example 2 * sizeof(buf) + 1 or whatever. > return -ENOENT; > > - fname->crypto_buf.name = kmalloc(32, GFP_KERNEL); > + fname->crypto_buf.name = kmalloc( > + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), GFP_KERNEL); > if (fname->crypto_buf.name == NULL) > return -ENOMEM; > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c > index c4a389a6027b..14b2a2335a32 100644 > --- a/fs/ext4/namei.c > +++ b/fs/ext4/namei.c > @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname, > int ret; > if (de->name_len < 16) > return 0; > - ret = memcmp(de->name + de->name_len - 16, > - fname->crypto_buf.name + 8, 16); > + ret = memcmp(de->name + de->name_len - 32, > + fname->crypto_buf.name + 8, 32); > return (ret == 0) ? 1 : 0; This could just be: return (ret == 0); > } > name = fname->crypto_buf.name; > len = fname->crypto_buf.len; > } > #endif > if (de->name_len != len) > return 0; > return (memcmp(de->name, name, len) == 0) ? 1 : 0; And similarly: return (memcmp(de->name, name, len) == 0); Cheers, Andreas
On Tue, Apr 18, 2017 at 11:06 PM, Gwendal Grignou <gwendal@chromium.org> wrote: > If we use only 16 bytes, due to how CBC works, if the names have the > same beginning, their last ciphertext block (16 bytes) may be identical. > > It happens when two file names share the first 16k bytes and both have > length withn 16 * n + 13 and 16 * n + 16. > > Instead use 32 bytes to build the filenames from encrypted data when > directory is scrambled. > > The drawback is the scrambled filenames change after applying the patch. > Consider an encrypted directory with: > > ls -il > total 8 > 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10 > system@framework@boot-telephony-common.art.crc > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > system@framework@boot-telephony-common.oat.crc > > Once the key is invalidated, without the patch, ls -li produces: > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _a1Psh01n8FdhC8s9pUywlAyFzlz7n6C3 > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _wJS,0akq14ehC8s9pUywlAyFzlz7n6C3 > > Both files show with the same inode. > > After the patch, the names are different, but the inode information is > now correct: > ls -li > 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _a1Psh01n8FtxbeglW8BqhuthSUxMqh6cFKwz2nSJDXCIXMXOvfqLcD > 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 > _wJS,0akq14eJcuQks7f2Vsg,zE0Jdz98FKwz2nSJDXCIXMXOvfqLcD > > Tested only on ext4. I hope you classify this patch as RFC then. We'll have problems when you just develop and test for ext4. :-) > Signed-off-by: Gwendal Grignou <gwendal@chromium.org> > --- > Script to reproduce the error: > > BASE_DIR="~/" > DIR="${BASE_DIR}/tmp" > # Create directory. > mkdir -p "${DIR}" > echo foobar | e4crypt add_key "${DIR}" > # Fill directory. > cd "${DIR}" > touch system@framework@boot-telephony-common.oat.crc > touch system@framework@boot-telephony-common.art.crc > cd .. > # Check files have different inode. > ls -il "${DIR}" > # Invalidate key > KEY="$(keyctl show | grep $(e4crypt get_policy "${DIR}" | cut -d ':' -f 2) | sed -ne 's/\(.*\) --al.*/\1/p')" > sync > keyctl invalidate "${KEY}" > echo 3 > /proc/sys/vm/drop_caches > # Once the key is invalidated, both files have the same inode: > ls -il "${DIR}" > if [ $(ls -i1 "${DIR}" | cut -d ' ' -f 1 | uniq | wc -l) -eq 1 ] ; then > echo same inode! > fi > # if we try to remove the directory, we will get an error > # : Structure needs cleaning > # rm -rf "${DIR}" > > fs/crypto/fname.c | 20 +++++++++++++------- > fs/ext4/namei.c | 4 ++-- > 2 files changed, 15 insertions(+), 9 deletions(-) > > diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c > index 80bb956e14e5..71ddc3eaa62d 100644 > --- a/fs/crypto/fname.c > +++ b/fs/crypto/fname.c > @@ -274,7 +274,7 @@ int fscrypt_fname_disk_to_usr(struct inode *inode, > struct fscrypt_str *oname) > { > const struct qstr qname = FSTR_TO_QSTR(iname); > - char buf[24]; > + char buf[FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64)]; > > if (fscrypt_is_dot_dotdot(&qname)) { > oname->name[0] = '.'; > @@ -295,14 +295,19 @@ int fscrypt_fname_disk_to_usr(struct inode *inode, > return 0; > } > if (hash) { > - memcpy(buf, &hash, 4); > - memcpy(buf + 4, &minor_hash, 4); > + memcpy(buf, &hash, sizeof(u32)); > + memcpy(buf + 4, &minor_hash, sizeof(u32)); > } else { > memset(buf, 0, 8); > } > - memcpy(buf + 8, iname->name + iname->len - 16, 16); > + memcpy(buf + sizeof(u64), > + iname->name + iname->len - FS_FNAME_CRYPTO_DIGEST_SIZE, > + FS_FNAME_CRYPTO_DIGEST_SIZE); > oname->name[0] = '_'; > - oname->len = 1 + digest_encode(buf, 24, oname->name + 1); > + oname->len = 1 + digest_encode( > + buf, > + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), > + oname->name + 1); > return 0; > } > EXPORT_SYMBOL(fscrypt_fname_disk_to_usr); > @@ -375,10 +380,11 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname, > */ > if (iname->name[0] == '_') > bigname = 1; > - if ((bigname && (iname->len != 33)) || (!bigname && (iname->len > 43))) > + if ((bigname && iname->len != 55) || (!bigname && (iname->len > 43))) > return -ENOENT; > > - fname->crypto_buf.name = kmalloc(32, GFP_KERNEL); > + fname->crypto_buf.name = kmalloc( > + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), GFP_KERNEL); > if (fname->crypto_buf.name == NULL) > return -ENOMEM; > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c > index c4a389a6027b..14b2a2335a32 100644 > --- a/fs/ext4/namei.c > +++ b/fs/ext4/namei.c > @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname, > int ret; > if (de->name_len < 16) > return 0; > - ret = memcmp(de->name + de->name_len - 16, > - fname->crypto_buf.name + 8, 16); > + ret = memcmp(de->name + de->name_len - 32, > + fname->crypto_buf.name + 8, 32); > return (ret == 0) ? 1 : 0; > } > name = fname->crypto_buf.name; Can the code still be able to read filenames which have been encrypted using the "old" scheme?
Eric, On Wed, Apr 19, 2017 at 1:01 AM, Eric Biggers <ebiggers3@gmail.com> wrote: > +Cc linux-f2fs-devel@lists.sourceforge.net > +Cc linux-mtd@lists.infradead.org (for ubifs) > > Hi Gwendal, > > On Tue, Apr 18, 2017 at 02:06:42PM -0700, Gwendal Grignou wrote: >> If we use only 16 bytes, due to how CBC works, if the names have the >> same beginning, their last ciphertext block (16 bytes) may be identical. >> >> It happens when two file names share the first 16k bytes and both have >> length withn 16 * n + 13 and 16 * n + 16. >> >> Instead use 32 bytes to build the filenames from encrypted data when >> directory is scrambled. > > Just some background for people who may be unfamiliar with what's going on here > (and it may be useful to include some of this in the patch description): > > When accessing files without access to the key, userspace needs to operate on a > filename derived from the ciphertext filename, which contains arbitrary bytes. > > But since it's supported to use filenames up to FILENAME_MAX (255 bytes) in > length when using encryption, we can't always base-64 encode the filename, since > that may make it too long. > > The way this is solved currently is that for filenames with ciphertext length > greater than 32 bytes, the filesystem provides an 8-byte "cookie" (split into > 'hash' and 'minor_hash'), which along with the last 16 bytes of the filename > ciphertext is base-64 encoded into a fixed-length name. The filesystem returns > this on readdir. Then, when a lookup is done, the filesystem translates this > info back into a specific directory entry. > > Since ext4 directory entries do not contain a hash field, ext4 relies only on > the 16 bytes of ciphertext to distinguish collisions within a directory block. > Unfortunately, this is broken because with the encryption mode used for > filenames (CTS), the ciphertext of the last 16-byte block depends only on the > plaintext up to and including the *second to last* block, not up to the last > block. This causes long filenames that differ just near the end to collide. > > We could fix this by using the second to last block of ciphertext rather than > the last one. However, using the last *two* blocks as you're proposing should > be fine too. > > Of course we could also hash the filename's ciphertext with SHA-256 or > something, but it's nice to take advantage of the encryption mode, and not have > to do yet another hash. > > I am not too worried about changing the way encrypted filenames are presented, > since applications are not supposed to rely on this. (Though we probably should > be doing something to catch broken applications, like encoding the filenames > slightly differently after each reboot...) > > Strangely, f2fs and ubifs do not use the bytes from the filename at all when > trying to find a specific directory entry in this case. So this patch doesn't > really affect them. This seems unreliable; perhaps we should introduce a > function like "fscrypt_name_matches()" which all the filesystems could call? > Can any of the f2fs and ubifs developers explain why they don't look at any > bytes from the filename? Not sure if I understand you correctly, but for long filenames UBIFS does a lookup by hash/cookie, not by filename.
On Wed, Apr 19, 2017 at 3:37 PM, Richard Weinberger <richard.weinberger@gmail.com> wrote: > Can the code still be able to read filenames which have been encrypted > using the "old" scheme? Bah, typing is hard. Should be read: Will the code still be able to read ...
Hi Richard, On Wed, Apr 19, 2017 at 03:37:42PM +0200, Richard Weinberger wrote: > > > > Tested only on ext4. > > I hope you classify this patch as RFC then. > We'll have problems when you just develop and test for ext4. :-) > It's a little difficult for people to test stuff on UBIFS without a turn-key solution like kvm-xfstests where they can just run something like 'kvm-xfstests -c ext4,f2fs,ubifs -g encrypt'. I did post patches to add UBIFS support to xfstests and kvm-xfstests a few months ago; maybe you're interested in taking them over and working to get them merged? > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c > > index c4a389a6027b..14b2a2335a32 100644 > > --- a/fs/ext4/namei.c > > +++ b/fs/ext4/namei.c > > @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname, > > int ret; > > if (de->name_len < 16) > > return 0; > > - ret = memcmp(de->name + de->name_len - 16, > > - fname->crypto_buf.name + 8, 16); > > + ret = memcmp(de->name + de->name_len - 32, > > + fname->crypto_buf.name + 8, 32); > > return (ret == 0) ? 1 : 0; > > } > > name = fname->crypto_buf.name; > > Can the code still be able to read filenames which have been encrypted > using the "old" scheme? > The patch only changes the presentation of long encrypted filenames when accessed without the key. It doesn't change how filenames are encrypted. - Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric, Am 19.04.2017 um 19:09 schrieb Eric Biggers: > Hi Richard, > > On Wed, Apr 19, 2017 at 03:37:42PM +0200, Richard Weinberger wrote: >>> >>> Tested only on ext4. >> >> I hope you classify this patch as RFC then. >> We'll have problems when you just develop and test for ext4. :-) >> > > It's a little difficult for people to test stuff on UBIFS without a turn-key > solution like kvm-xfstests where they can just run something like > 'kvm-xfstests -c ext4,f2fs,ubifs -g encrypt'. > > I did post patches to add UBIFS support to xfstests and kvm-xfstests a few > months ago; maybe you're interested in taking them over and working to get them > merged? I assigned this talk already to David. He can tell what the status is. >>> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c >>> index c4a389a6027b..14b2a2335a32 100644 >>> --- a/fs/ext4/namei.c >>> +++ b/fs/ext4/namei.c >>> @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname, >>> int ret; >>> if (de->name_len < 16) >>> return 0; >>> - ret = memcmp(de->name + de->name_len - 16, >>> - fname->crypto_buf.name + 8, 16); >>> + ret = memcmp(de->name + de->name_len - 32, >>> + fname->crypto_buf.name + 8, 32); >>> return (ret == 0) ? 1 : 0; >>> } >>> name = fname->crypto_buf.name; >> >> Can the code still be able to read filenames which have been encrypted >> using the "old" scheme? >> > > The patch only changes the presentation of long encrypted filenames when > accessed without the key. It doesn't change how filenames are encrypted. Thanks for pointing this out. Thanks, //richard -- To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 19, 2017 at 03:40:13PM +0200, Richard Weinberger wrote: > > Strangely, f2fs and ubifs do not use the bytes from the filename at all when > > trying to find a specific directory entry in this case. So this patch doesn't > > really affect them. This seems unreliable; perhaps we should introduce a > > function like "fscrypt_name_matches()" which all the filesystems could call? > > Can any of the f2fs and ubifs developers explain why they don't look at any > > bytes from the filename? > > Not sure if I understand you correctly, but for long filenames UBIFS > does a lookup > by hash/cookie, not by filename. > Well, like I said to Jaegeuk for F2FS, that's what the code does, but _why_? Like F2FS, it's probably not the case that the hash is sufficient to reliably identify a directory entry. Granted, UBIFS does it a lot better than F2FS since UBIFS uses two 32-bit hashes rather than just one, but it seems the second hash may be neither necessary nor sufficient to identify a specific directory entry, and it should be looking at the bytes of ciphertext from the filename instead, like what ext4 does. (Provided that is fixed to account for how CTS mode encryption works.) - Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 19.04.2017 um 19:16 schrieb Eric Biggers: > On Wed, Apr 19, 2017 at 03:40:13PM +0200, Richard Weinberger wrote: >>> Strangely, f2fs and ubifs do not use the bytes from the filename at all when >>> trying to find a specific directory entry in this case. So this patch doesn't >>> really affect them. This seems unreliable; perhaps we should introduce a >>> function like "fscrypt_name_matches()" which all the filesystems could call? >>> Can any of the f2fs and ubifs developers explain why they don't look at any >>> bytes from the filename? >> >> Not sure if I understand you correctly, but for long filenames UBIFS >> does a lookup >> by hash/cookie, not by filename. >> > > Well, like I said to Jaegeuk for F2FS, that's what the code does, but _why_? > Like F2FS, it's probably not the case that the hash is sufficient to reliably > identify a directory entry. Granted, UBIFS does it a lot better than F2FS since > UBIFS uses two 32-bit hashes rather than just one, but it seems the second hash > may be neither necessary nor sufficient to identify a specific directory entry, > and it should be looking at the bytes of ciphertext from the filename instead, > like what ext4 does. (Provided that is fixed to account for how CTS mode > encryption works.) Let me dig into this, maybe I made a boo boo. The idea was looking up by the filename hash and resolving possible collisions using the secondary hash. Thanks, //richard -- To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/19/2017 07:12 PM, Richard Weinberger wrote: > Eric, > > Am 19.04.2017 um 19:09 schrieb Eric Biggers: >> Hi Richard, >> >> On Wed, Apr 19, 2017 at 03:37:42PM +0200, Richard Weinberger wrote: >>>> >>>> Tested only on ext4. >>> >>> I hope you classify this patch as RFC then. >>> We'll have problems when you just develop and test for ext4. :-) >>> >> >> It's a little difficult for people to test stuff on UBIFS without a turn-key >> solution like kvm-xfstests where they can just run something like >> 'kvm-xfstests -c ext4,f2fs,ubifs -g encrypt'. >> >> I did post patches to add UBIFS support to xfstests and kvm-xfstests a few >> months ago; maybe you're interested in taking them over and working to get them >> merged? > > I assigned this talk already to David. > He can tell what the status is. > What I have right now is mostly similar to the previous patch for xfstests-dev that was submitted to the mailing list. I haven't looked into the xfstests-bld patch yet. To sumarize what happend so far: A while ago, I took a look at Erics xfstest patches and the similar patch series from Dongsheng Yang. Based on those, I got the xfstests running with UBIFS on nandsim inside a VM with fairly little changes. AFAIR there were two tests failing, one of them being generic/129, apparently because it exhausts the space on one of the UBI volumes, and another one for which a fix was provided. My work on the xfstests-dev patch was preempted by other projects, but I briefly got back to it when an O_TMPFILE regression was reported on the mtd mailing list. This should have been caught by generic/004 but wasn't because the UBIFS error message didn't match the grep pattern run on dmesg. Other such cases may exist. Today, after rebasing/porting my local changes to upstream xfstets-dev, I ran the tests against on both Richards UBIFS tree and Linus' tree. On both kernels I'm now getting 5 failing tests out of 94 (with one of them being generic/129 again). Further work on some of the test scripts is required. David -- To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric, On Wed, Apr 19, 2017 at 7:21 PM, Richard Weinberger <richard@nod.at> wrote: >> Well, like I said to Jaegeuk for F2FS, that's what the code does, but _why_? >> Like F2FS, it's probably not the case that the hash is sufficient to reliably >> identify a directory entry. Granted, UBIFS does it a lot better than F2FS since >> UBIFS uses two 32-bit hashes rather than just one, but it seems the second hash >> may be neither necessary nor sufficient to identify a specific directory entry, >> and it should be looking at the bytes of ciphertext from the filename instead, >> like what ext4 does. (Provided that is fixed to account for how CTS mode >> encryption works.) > > Let me dig into this, maybe I made a boo boo. > The idea was looking up by the filename hash and resolving > possible collisions using the secondary hash. In ubifs_lookup() we handle two cases: 1. lookup of a bigname, both fscrypt_name->hash and ->minor_hash are valid. ->hash is r5(diskname) and ->minor_hash is the secondary hash, AKA cookie. UBIFS fed this hashes in ubifs_readdir() to fscrypt. 2. lookup of a non-bigname, in this case we compute r5(diskname) and use the diskname itself for lookups. So, in case 1 we avoid collisions by using a 64bit key and in case 2 by using the 32bit key plus a linear search and memcmp() of diskname.
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c index 80bb956e14e5..71ddc3eaa62d 100644 --- a/fs/crypto/fname.c +++ b/fs/crypto/fname.c @@ -274,7 +274,7 @@ int fscrypt_fname_disk_to_usr(struct inode *inode, struct fscrypt_str *oname) { const struct qstr qname = FSTR_TO_QSTR(iname); - char buf[24]; + char buf[FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64)]; if (fscrypt_is_dot_dotdot(&qname)) { oname->name[0] = '.'; @@ -295,14 +295,19 @@ int fscrypt_fname_disk_to_usr(struct inode *inode, return 0; } if (hash) { - memcpy(buf, &hash, 4); - memcpy(buf + 4, &minor_hash, 4); + memcpy(buf, &hash, sizeof(u32)); + memcpy(buf + 4, &minor_hash, sizeof(u32)); } else { memset(buf, 0, 8); } - memcpy(buf + 8, iname->name + iname->len - 16, 16); + memcpy(buf + sizeof(u64), + iname->name + iname->len - FS_FNAME_CRYPTO_DIGEST_SIZE, + FS_FNAME_CRYPTO_DIGEST_SIZE); oname->name[0] = '_'; - oname->len = 1 + digest_encode(buf, 24, oname->name + 1); + oname->len = 1 + digest_encode( + buf, + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), + oname->name + 1); return 0; } EXPORT_SYMBOL(fscrypt_fname_disk_to_usr); @@ -375,10 +380,11 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname, */ if (iname->name[0] == '_') bigname = 1; - if ((bigname && (iname->len != 33)) || (!bigname && (iname->len > 43))) + if ((bigname && iname->len != 55) || (!bigname && (iname->len > 43))) return -ENOENT; - fname->crypto_buf.name = kmalloc(32, GFP_KERNEL); + fname->crypto_buf.name = kmalloc( + FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), GFP_KERNEL); if (fname->crypto_buf.name == NULL) return -ENOMEM; diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index c4a389a6027b..14b2a2335a32 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname, int ret; if (de->name_len < 16) return 0; - ret = memcmp(de->name + de->name_len - 16, - fname->crypto_buf.name + 8, 16); + ret = memcmp(de->name + de->name_len - 32, + fname->crypto_buf.name + 8, 32); return (ret == 0) ? 1 : 0; } name = fname->crypto_buf.name;
If we use only 16 bytes, due to how CBC works, if the names have the same beginning, their last ciphertext block (16 bytes) may be identical. It happens when two file names share the first 16k bytes and both have length withn 16 * n + 13 and 16 * n + 16. Instead use 32 bytes to build the filenames from encrypted data when directory is scrambled. The drawback is the scrambled filenames change after applying the patch. Consider an encrypted directory with: ls -il total 8 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10 system@framework@boot-telephony-common.art.crc 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 system@framework@boot-telephony-common.oat.crc Once the key is invalidated, without the patch, ls -li produces: 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 _a1Psh01n8FdhC8s9pUywlAyFzlz7n6C3 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 _wJS,0akq14ehC8s9pUywlAyFzlz7n6C3 Both files show with the same inode. After the patch, the names are different, but the inode information is now correct: ls -li 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10 _a1Psh01n8FtxbeglW8BqhuthSUxMqh6cFKwz2nSJDXCIXMXOvfqLcD 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10 _wJS,0akq14eJcuQks7f2Vsg,zE0Jdz98FKwz2nSJDXCIXMXOvfqLcD Tested only on ext4. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> --- Script to reproduce the error: BASE_DIR="~/" DIR="${BASE_DIR}/tmp" # Create directory. mkdir -p "${DIR}" echo foobar | e4crypt add_key "${DIR}" # Fill directory. cd "${DIR}" touch system@framework@boot-telephony-common.oat.crc touch system@framework@boot-telephony-common.art.crc cd .. # Check files have different inode. ls -il "${DIR}" # Invalidate key KEY="$(keyctl show | grep $(e4crypt get_policy "${DIR}" | cut -d ':' -f 2) | sed -ne 's/\(.*\) --al.*/\1/p')" sync keyctl invalidate "${KEY}" echo 3 > /proc/sys/vm/drop_caches # Once the key is invalidated, both files have the same inode: ls -il "${DIR}" if [ $(ls -i1 "${DIR}" | cut -d ' ' -f 1 | uniq | wc -l) -eq 1 ] ; then echo same inode! fi # if we try to remove the directory, we will get an error # : Structure needs cleaning # rm -rf "${DIR}" fs/crypto/fname.c | 20 +++++++++++++------- fs/ext4/namei.c | 4 ++-- 2 files changed, 15 insertions(+), 9 deletions(-)