Message ID | 20241010-tonyk-tmpfs-v6-1-79f0ae02e4c8@igalia.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | tmpfs: Add case-insensitive support for tmpfs | expand |
André Almeida <andrealmeid@igalia.com> writes: > +static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name) > +{ > + if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb)) > + return true; > + > + /* > + * A casefold dir must have a encoding set, unless the filesystem > + * is corrupted > + */ > + if (WARN_ON_ONCE(!dir->i_sb->s_encoding)) > + return true; > + > + return utf8_validate(dir->i_sb->s_encoding, name); There is something fishy here. Concerningly, the fstests test doesn't catch it. utf8_validate is defined as: int utf8_validate(const struct unicode_map *um, const struct qstr *str) Which returns 0 on success and !0 on error. Thus, when casting to bool, the return code should be negated. But generic/556 doesn't fail. That's because we are over cautious, and also check the string at the end of generic_ci_d_hash. So we never really reach utf8_validate in the tested case. But if you comment the final if in generic_ci_d_hash, you'll see this patchset regresses the fstests case generic/556 over ext4. We really need the check in both places, though. We don't want to rely on the behavior of generic_ci_d_hash to block invalid filenames, as that might change.
Em 15/10/2024 12:59, Gabriel Krisman Bertazi escreveu: > André Almeida <andrealmeid@igalia.com> writes: > >> +static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name) >> +{ >> + if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb)) >> + return true; >> + >> + /* >> + * A casefold dir must have a encoding set, unless the filesystem >> + * is corrupted >> + */ >> + if (WARN_ON_ONCE(!dir->i_sb->s_encoding)) >> + return true; >> + >> + return utf8_validate(dir->i_sb->s_encoding, name); > > There is something fishy here. Concerningly, the fstests test doesn't > catch it. > > utf8_validate is defined as: > > int utf8_validate(const struct unicode_map *um, const struct qstr *str) > > Which returns 0 on success and !0 on error. Thus, when casting to bool, > the return code should be negated. > > But generic/556 doesn't fail. That's because we are over cautious, and > also check the string at the end of generic_ci_d_hash. So we never > really reach utf8_validate in the tested case. > > But if you comment the final if in generic_ci_d_hash, you'll see this > patchset regresses the fstests case generic/556 over ext4. > > We really need the check in both places, though. We don't want to rely > on the behavior of generic_ci_d_hash to block invalid filenames, as that > might change. > Thanks Krisman! Nice catch. I fixed this for the next version. Testing with the modified generic_ci_d_hash(), I also realized that the validation was in the wrong place and leaving an inode behind, this fixed: diff --git a/mm/shmem.c b/mm/shmem.c index eb1ea1f3b37c..7bd7ca5777af 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -3624,13 +3624,13 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir, struct inode *inode; int error; + if (!generic_ci_validate_strict_name(dir, &dentry->d_name)) + return -EINVAL; + inode = shmem_get_inode(idmap, dir->i_sb, dir, mode, dev, VM_NORESERVE); if (IS_ERR(inode)) return PTR_ERR(inode); - if (!generic_ci_validate_strict_name(dir, &dentry->d_name)) - return -EINVAL; -
diff --git a/include/linux/fs.h b/include/linux/fs.h index 3559446279c152c70b7f1f3e0154f6e66a5aba33..9b232aee4cd6ad8dce64370db0111bd25d3fedfa 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -45,6 +45,7 @@ #include <linux/slab.h> #include <linux/maple_tree.h> #include <linux/rw_hint.h> +#include <linux/unicode.h> #include <asm/byteorder.h> #include <uapi/linux/fs.h> @@ -3456,6 +3457,50 @@ extern int generic_ci_match(const struct inode *parent, const struct qstr *folded_name, const u8 *de_name, u32 de_name_len); +#if IS_ENABLED(CONFIG_UNICODE) +/** + * generic_ci_validate_strict_name - Check if a given name is suitable + * for a directory + * + * This functions checks if the proposed filename is valid for the + * parent directory. That means that only valid UTF-8 filenames will be + * accepted for casefold directories from filesystems created with the + * strict encoding flag. That also means that any name will be + * accepted for directories that doesn't have casefold enabled, or + * aren't being strict with the encoding. + * + * @dir: inode of the directory where the new file will be created + * @name: name of the new file + * + * Return: + * * True if the filename is suitable for this directory. It can be + * true if a given name is not suitable for a strict encoding + * directory, but the directory being used isn't strict + * * False if the filename isn't suitable for this directory. This only + * happens when a directory is casefolded and the filesystem is strict + * about its encoding. + */ +static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name) +{ + if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb)) + return true; + + /* + * A casefold dir must have a encoding set, unless the filesystem + * is corrupted + */ + if (WARN_ON_ONCE(!dir->i_sb->s_encoding)) + return true; + + return utf8_validate(dir->i_sb->s_encoding, name); +} +#else +static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name) +{ + return true; +} +#endif + static inline bool sb_has_encoding(const struct super_block *sb) { #if IS_ENABLED(CONFIG_UNICODE)
Create a helper function for filesystems do the checks required for casefold directories and strict encoding. Suggested-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: André Almeida <andrealmeid@igalia.com> --- Changes from v4: - Inline this function Changes from v2: - Moved function to libfs and adpated its name - Wrapped at 72 chars column - Decomposed the big if (...) to be more clear --- include/linux/fs.h | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+)