Message ID | 20240622154904.3774273-4-yu.ma@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | fs/file.c: optimize the critical section of file_lock in | expand |
On Sat 22-06-24 11:49:04, Yu Ma wrote: > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > allocated fd is NULL. Remove this sanity check since it can be assured by > exisitng zero initilization and NULL set when recycling fd. ^^^ existing ^^^ initialization Well, since this is a sanity check, it is expected it never hits. Yet searching the web shows it has hit a few times in the past :). So would wrapping this with unlikely() give a similar performance gain while keeping debugability? If unlikely() does not help, I agree we can remove this since fd_install() actually has the same check: BUG_ON(fdt->fd[fd] != NULL); and there we need the cacheline anyway so performance impact is minimal. Now, this condition in alloc_fd() is nice that it does not take the kernel down so perhaps we could change the BUG_ON to WARN() dumping similar kind of info as alloc_fd()? Honza > Combined with patch 1 and 2 in series, pts/blogbench-1.1.0 read improved by > 32%, write improved by 17% on Intel ICX 160 cores configuration with v6.10-rc4. > > Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> > Signed-off-by: Yu Ma <yu.ma@intel.com> > --- > fs/file.c | 7 ------- > 1 file changed, 7 deletions(-) > > diff --git a/fs/file.c b/fs/file.c > index b4d25f6d4c19..1153b0b7ba3d 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > else > __clear_close_on_exec(fd, fdt); > error = fd; > -#if 1 > - /* Sanity check */ > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > - rcu_assign_pointer(fdt->fd[fd], NULL); > - } > -#endif > > out: > spin_unlock(&files->file_lock); > -- > 2.43.0 >
On Tue, Jun 25, 2024 at 2:08 PM Jan Kara <jack@suse.cz> wrote: > > On Sat 22-06-24 11:49:04, Yu Ma wrote: > > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > > allocated fd is NULL. Remove this sanity check since it can be assured by > > exisitng zero initilization and NULL set when recycling fd. > ^^^ existing ^^^ initialization > > Well, since this is a sanity check, it is expected it never hits. Yet > searching the web shows it has hit a few times in the past :). So would > wrapping this with unlikely() give a similar performance gain while keeping > debugability? If unlikely() does not help, I agree we can remove this since > fd_install() actually has the same check: > > BUG_ON(fdt->fd[fd] != NULL); > > and there we need the cacheline anyway so performance impact is minimal. > Now, this condition in alloc_fd() is nice that it does not take the kernel > down so perhaps we could change the BUG_ON to WARN() dumping similar kind > of info as alloc_fd()? > Christian suggested just removing it. To my understanding the problem is not the branch per se, but the the cacheline bounce of the fd array induced by reading the status. Note the thing also nullifies the pointer, kind of defeating the BUG_ON in fd_install. I'm guessing it's not going to hurt to branch on it after releasing the lock and forego nullifying, more or less: diff --git a/fs/file.c b/fs/file.c index a3b72aa64f11..d22b867db246 100644 --- a/fs/file.c +++ b/fs/file.c @@ -524,11 +524,11 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) */ error = -EMFILE; if (fd >= end) - goto out; + goto out_locked; error = expand_files(files, fd); if (error < 0) - goto out; + goto out_locked; /* * If we needed to expand the fs array we @@ -546,15 +546,15 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) else __clear_close_on_exec(fd, fdt); error = fd; -#if 1 - /* Sanity check */ - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { + spin_unlock(&files->file_lock); + + if (unlikely(rcu_access_pointer(fdt->fd[fd]) != NULL)) { printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); - rcu_assign_pointer(fdt->fd[fd], NULL); } -#endif -out: + return error; + +out_locked: spin_unlock(&files->file_lock); return error; } > Honza > > > Combined with patch 1 and 2 in series, pts/blogbench-1.1.0 read improved by > > 32%, write improved by 17% on Intel ICX 160 cores configuration with v6.10-rc4. > > > > Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> > > Signed-off-by: Yu Ma <yu.ma@intel.com> > > --- > > fs/file.c | 7 ------- > > 1 file changed, 7 deletions(-) > > > > diff --git a/fs/file.c b/fs/file.c > > index b4d25f6d4c19..1153b0b7ba3d 100644 > > --- a/fs/file.c > > +++ b/fs/file.c > > @@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > > else > > __clear_close_on_exec(fd, fdt); > > error = fd; > > -#if 1 > > - /* Sanity check */ > > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > > - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > > - rcu_assign_pointer(fdt->fd[fd], NULL); > > - } > > -#endif > > > > out: > > spin_unlock(&files->file_lock); > > -- > > 2.43.0 > > > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR
On Tue, Jun 25, 2024 at 3:09 PM Mateusz Guzik <mjguzik@gmail.com> wrote: > > On Tue, Jun 25, 2024 at 2:08 PM Jan Kara <jack@suse.cz> wrote: > > > > On Sat 22-06-24 11:49:04, Yu Ma wrote: > > > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > > > allocated fd is NULL. Remove this sanity check since it can be assured by > > > exisitng zero initilization and NULL set when recycling fd. > > ^^^ existing ^^^ initialization > > > > Well, since this is a sanity check, it is expected it never hits. Yet > > searching the web shows it has hit a few times in the past :). So would > > wrapping this with unlikely() give a similar performance gain while keeping > > debugability? If unlikely() does not help, I agree we can remove this since > > fd_install() actually has the same check: > > > > BUG_ON(fdt->fd[fd] != NULL); > > > > and there we need the cacheline anyway so performance impact is minimal. > > Now, this condition in alloc_fd() is nice that it does not take the kernel > > down so perhaps we could change the BUG_ON to WARN() dumping similar kind > > of info as alloc_fd()? > > > > Christian suggested just removing it. > > To my understanding the problem is not the branch per se, but the the > cacheline bounce of the fd array induced by reading the status. > > Note the thing also nullifies the pointer, kind of defeating the > BUG_ON in fd_install. > > I'm guessing it's not going to hurt to branch on it after releasing > the lock and forego nullifying, more or less: > diff --git a/fs/file.c b/fs/file.c > index a3b72aa64f11..d22b867db246 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -524,11 +524,11 @@ static int alloc_fd(unsigned start, unsigned > end, unsigned flags) > */ > error = -EMFILE; > if (fd >= end) > - goto out; > + goto out_locked; > > error = expand_files(files, fd); > if (error < 0) > - goto out; > + goto out_locked; > > /* > * If we needed to expand the fs array we > @@ -546,15 +546,15 @@ static int alloc_fd(unsigned start, unsigned > end, unsigned flags) > else > __clear_close_on_exec(fd, fdt); > error = fd; > -#if 1 > - /* Sanity check */ > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > + spin_unlock(&files->file_lock); > + > + if (unlikely(rcu_access_pointer(fdt->fd[fd]) != NULL)) { > printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > - rcu_assign_pointer(fdt->fd[fd], NULL); > } > -#endif Now that I sent it it is of course not safe to deref without protection from either rcu or the lock, so this would have to be wrapped with rcu_read_lock, which makes it even less appealing. Whacking the thing as in the submitted patch seems like the best way forward here. :) > > -out: > + return error; > + > +out_locked: > spin_unlock(&files->file_lock); > return error; > } > > > > > Honza > > > > > Combined with patch 1 and 2 in series, pts/blogbench-1.1.0 read improved by > > > 32%, write improved by 17% on Intel ICX 160 cores configuration with v6.10-rc4. > > > > > > Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> > > > Signed-off-by: Yu Ma <yu.ma@intel.com> > > > --- > > > fs/file.c | 7 ------- > > > 1 file changed, 7 deletions(-) > > > > > > diff --git a/fs/file.c b/fs/file.c > > > index b4d25f6d4c19..1153b0b7ba3d 100644 > > > --- a/fs/file.c > > > +++ b/fs/file.c > > > @@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > > > else > > > __clear_close_on_exec(fd, fdt); > > > error = fd; > > > -#if 1 > > > - /* Sanity check */ > > > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > > > - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > > > - rcu_assign_pointer(fdt->fd[fd], NULL); > > > - } > > > -#endif > > > > > > out: > > > spin_unlock(&files->file_lock); > > > -- > > > 2.43.0 > > > > > -- > > Jan Kara <jack@suse.com> > > SUSE Labs, CR > > > > -- > Mateusz Guzik <mjguzik gmail.com>
On Tue 25-06-24 15:11:23, Mateusz Guzik wrote: > On Tue, Jun 25, 2024 at 3:09 PM Mateusz Guzik <mjguzik@gmail.com> wrote: > > > > On Tue, Jun 25, 2024 at 2:08 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Sat 22-06-24 11:49:04, Yu Ma wrote: > > > > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > > > > allocated fd is NULL. Remove this sanity check since it can be assured by > > > > exisitng zero initilization and NULL set when recycling fd. > > > ^^^ existing ^^^ initialization > > > > > > Well, since this is a sanity check, it is expected it never hits. Yet > > > searching the web shows it has hit a few times in the past :). So would > > > wrapping this with unlikely() give a similar performance gain while keeping > > > debugability? If unlikely() does not help, I agree we can remove this since > > > fd_install() actually has the same check: > > > > > > BUG_ON(fdt->fd[fd] != NULL); > > > > > > and there we need the cacheline anyway so performance impact is minimal. > > > Now, this condition in alloc_fd() is nice that it does not take the kernel > > > down so perhaps we could change the BUG_ON to WARN() dumping similar kind > > > of info as alloc_fd()? > > > > > > > Christian suggested just removing it. > > > > To my understanding the problem is not the branch per se, but the the > > cacheline bounce of the fd array induced by reading the status. > > > > Note the thing also nullifies the pointer, kind of defeating the > > BUG_ON in fd_install. > > > > I'm guessing it's not going to hurt to branch on it after releasing > > the lock and forego nullifying, more or less: > > diff --git a/fs/file.c b/fs/file.c > > index a3b72aa64f11..d22b867db246 100644 > > --- a/fs/file.c > > +++ b/fs/file.c > > @@ -524,11 +524,11 @@ static int alloc_fd(unsigned start, unsigned > > end, unsigned flags) > > */ > > error = -EMFILE; > > if (fd >= end) > > - goto out; > > + goto out_locked; > > > > error = expand_files(files, fd); > > if (error < 0) > > - goto out; > > + goto out_locked; > > > > /* > > * If we needed to expand the fs array we > > @@ -546,15 +546,15 @@ static int alloc_fd(unsigned start, unsigned > > end, unsigned flags) > > else > > __clear_close_on_exec(fd, fdt); > > error = fd; > > -#if 1 > > - /* Sanity check */ > > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > > + spin_unlock(&files->file_lock); > > + > > + if (unlikely(rcu_access_pointer(fdt->fd[fd]) != NULL)) { > > printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > > - rcu_assign_pointer(fdt->fd[fd], NULL); > > } > > -#endif > > Now that I sent it it is of course not safe to deref without > protection from either rcu or the lock, so this would have to be > wrapped with rcu_read_lock, which makes it even less appealing. > > Whacking the thing as in the submitted patch seems like the best way > forward here. :) Yeah, as I wrote, I'm fine removing it, in particular if Christian is of the same opinion. I was more musing about whether we should make the check in fd_install() less aggressive since it is now more likely to trigger... Honza
On Tue, Jun 25, 2024 at 03:30:31PM GMT, Jan Kara wrote: > On Tue 25-06-24 15:11:23, Mateusz Guzik wrote: > > On Tue, Jun 25, 2024 at 3:09 PM Mateusz Guzik <mjguzik@gmail.com> wrote: > > > > > > On Tue, Jun 25, 2024 at 2:08 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Sat 22-06-24 11:49:04, Yu Ma wrote: > > > > > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > > > > > allocated fd is NULL. Remove this sanity check since it can be assured by > > > > > exisitng zero initilization and NULL set when recycling fd. > > > > ^^^ existing ^^^ initialization > > > > > > > > Well, since this is a sanity check, it is expected it never hits. Yet > > > > searching the web shows it has hit a few times in the past :). So would > > > > wrapping this with unlikely() give a similar performance gain while keeping > > > > debugability? If unlikely() does not help, I agree we can remove this since > > > > fd_install() actually has the same check: > > > > > > > > BUG_ON(fdt->fd[fd] != NULL); > > > > > > > > and there we need the cacheline anyway so performance impact is minimal. > > > > Now, this condition in alloc_fd() is nice that it does not take the kernel > > > > down so perhaps we could change the BUG_ON to WARN() dumping similar kind > > > > of info as alloc_fd()? > > > > > > > > > > Christian suggested just removing it. > > > > > > To my understanding the problem is not the branch per se, but the the > > > cacheline bounce of the fd array induced by reading the status. > > > > > > Note the thing also nullifies the pointer, kind of defeating the > > > BUG_ON in fd_install. > > > > > > I'm guessing it's not going to hurt to branch on it after releasing > > > the lock and forego nullifying, more or less: > > > diff --git a/fs/file.c b/fs/file.c > > > index a3b72aa64f11..d22b867db246 100644 > > > --- a/fs/file.c > > > +++ b/fs/file.c > > > @@ -524,11 +524,11 @@ static int alloc_fd(unsigned start, unsigned > > > end, unsigned flags) > > > */ > > > error = -EMFILE; > > > if (fd >= end) > > > - goto out; > > > + goto out_locked; > > > > > > error = expand_files(files, fd); > > > if (error < 0) > > > - goto out; > > > + goto out_locked; > > > > > > /* > > > * If we needed to expand the fs array we > > > @@ -546,15 +546,15 @@ static int alloc_fd(unsigned start, unsigned > > > end, unsigned flags) > > > else > > > __clear_close_on_exec(fd, fdt); > > > error = fd; > > > -#if 1 > > > - /* Sanity check */ > > > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > > > + spin_unlock(&files->file_lock); > > > + > > > + if (unlikely(rcu_access_pointer(fdt->fd[fd]) != NULL)) { > > > printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > > > - rcu_assign_pointer(fdt->fd[fd], NULL); > > > } > > > -#endif > > > > Now that I sent it it is of course not safe to deref without > > protection from either rcu or the lock, so this would have to be > > wrapped with rcu_read_lock, which makes it even less appealing. > > > > Whacking the thing as in the submitted patch seems like the best way > > forward here. :) > > Yeah, as I wrote, I'm fine removing it, in particular if Christian is of > the same opinion. I was more musing about whether we should make the check > in fd_install() less aggressive since it is now more likely to trigger... We could change it to WARN_ON() and then people can get BUG_ON() behavior when they turn WARN into BUG which apparently is a thing that we support.
diff --git a/fs/file.c b/fs/file.c index b4d25f6d4c19..1153b0b7ba3d 100644 --- a/fs/file.c +++ b/fs/file.c @@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) else __clear_close_on_exec(fd, fdt); error = fd; -#if 1 - /* Sanity check */ - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); - rcu_assign_pointer(fdt->fd[fd], NULL); - } -#endif out: spin_unlock(&files->file_lock);