Message ID | 20250409143138.568173-1-colin.i.king@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | select: do_pollfd: add unlikely branch hint return path | expand |
On Wed, Apr 09, 2025 at 03:31:38PM +0100, Colin Ian King wrote: > Adding an unlikely() hint on the fd < 0 comparison return path improves > run-time performance of the mincore system call. gcov based coverage > analysis shows that this path return path is highly unlikely. > > Benchmarking on an Debian based Intel(R) Core(TM) Ultra 9 285K with > a 6.15-rc1 kernel and a poll of 1024 file descriptors with zero timeout > shows an call reduction from 32818 ns down to 32635 ns, which is a ~0.5% > performance improvement. > > Results based on running 25 tests with turbo disabled (to reduce clock > freq turbo changes), with 30 second run per test and comparing the number > of poll() calls per second. The % standard deviation of the 25 tests > was 0.08%, so results are reliable. > I don't think adding a branch hint warrants benchmarking of the sort. Instead the thing to do is to check if the prediction matches real world uses. While it is impossible to check this for all programs out there, it should not be a significant time investment to look to check some of the popular ones out there. Normally I would do it with bpftrace, but this comes from a user-backed area instead of func args, so involved hackery may be needed which is not warranted the change. Perhaps running strace on a bunch of network progs would also do it (ssh, browser?). I have to say I did not even know one can legally pass a fd < 0 to poll and I never seen it in action, so I don't expect many users. ;)
On 09/04/2025 16:18, Mateusz Guzik wrote: > On Wed, Apr 09, 2025 at 03:31:38PM +0100, Colin Ian King wrote: >> Adding an unlikely() hint on the fd < 0 comparison return path improves >> run-time performance of the mincore system call. gcov based coverage >> analysis shows that this path return path is highly unlikely. >> >> Benchmarking on an Debian based Intel(R) Core(TM) Ultra 9 285K with >> a 6.15-rc1 kernel and a poll of 1024 file descriptors with zero timeout >> shows an call reduction from 32818 ns down to 32635 ns, which is a ~0.5% >> performance improvement. >> >> Results based on running 25 tests with turbo disabled (to reduce clock >> freq turbo changes), with 30 second run per test and comparing the number >> of poll() calls per second. The % standard deviation of the 25 tests >> was 0.08%, so results are reliable. >> > > I don't think adding a branch hint warrants benchmarking of the sort. > > Instead the thing to do is to check if the prediction matches real world > uses. > > While it is impossible to check this for all programs out there, it > should not be a significant time investment to look to check some of the > popular ones out there. Normally I would do it with bpftrace, but this > comes from a user-backed area instead of func args, so involved hackery > may be needed which is not warranted the change. Perhaps running strace > on a bunch of network progs would also do it (ssh, browser?). > > I have to say I did not even know one can legally pass a fd < 0 to poll > and I never seen it in action, so I don't expect many users. ;) I did check this based on gcov coverage (mentioned in the commit message) and this is based on running gcov data from running stress-ng and kernel builds and I've been looking for branch hint performance wins based on the top 250 if statements that are not already hinted using likely/unlikely. Colin
On Wed, Apr 9, 2025 at 5:23 PM Colin King (gmail) <colin.i.king@gmail.com> wrote: > > On 09/04/2025 16:18, Mateusz Guzik wrote: > > On Wed, Apr 09, 2025 at 03:31:38PM +0100, Colin Ian King wrote: > >> Adding an unlikely() hint on the fd < 0 comparison return path improves > >> run-time performance of the mincore system call. gcov based coverage > >> analysis shows that this path return path is highly unlikely. > >> > >> Benchmarking on an Debian based Intel(R) Core(TM) Ultra 9 285K with > >> a 6.15-rc1 kernel and a poll of 1024 file descriptors with zero timeout > >> shows an call reduction from 32818 ns down to 32635 ns, which is a ~0.5% > >> performance improvement. > >> > >> Results based on running 25 tests with turbo disabled (to reduce clock > >> freq turbo changes), with 30 second run per test and comparing the number > >> of poll() calls per second. The % standard deviation of the 25 tests > >> was 0.08%, so results are reliable. > >> > > > > I don't think adding a branch hint warrants benchmarking of the sort. > > > > Instead the thing to do is to check if the prediction matches real world > > uses. > > > > While it is impossible to check this for all programs out there, it > > should not be a significant time investment to look to check some of the > > popular ones out there. Normally I would do it with bpftrace, but this > > comes from a user-backed area instead of func args, so involved hackery > > may be needed which is not warranted the change. Perhaps running strace > > on a bunch of network progs would also do it (ssh, browser?). > > > > I have to say I did not even know one can legally pass a fd < 0 to poll > > and I never seen it in action, so I don't expect many users. ;) > > I did check this based on gcov coverage (mentioned in the commit > message) and this is based on running gcov data from running stress-ng > and kernel builds and I've been looking for branch hint performance wins > based on the top 250 if statements that are not already hinted using > likely/unlikely. > Well now that you mention it, the commit message claims *mincore*. :) I presume not edited from a different submission. You did not specify what you fed to gcov in there though. The kernel build is fine. However, stress-ng intentionally does weird things and can't be used as an argument here. It just may or may not accidentally line up with reality and any wins/loses have to be analyzed for legitimacy. I just straced ssh and firefox, both of which poll and only with non-negative fds [that I have seen], so I think the patch is fine.
diff --git a/fs/select.c b/fs/select.c index 7da531b1cf6b..0eaf3522abe9 100644 --- a/fs/select.c +++ b/fs/select.c @@ -857,7 +857,7 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait, int fd = pollfd->fd; __poll_t mask, filter; - if (fd < 0) + if (unlikely(fd < 0)) return 0; CLASS(fd, f)(fd);
Adding an unlikely() hint on the fd < 0 comparison return path improves run-time performance of the mincore system call. gcov based coverage analysis shows that this path return path is highly unlikely. Benchmarking on an Debian based Intel(R) Core(TM) Ultra 9 285K with a 6.15-rc1 kernel and a poll of 1024 file descriptors with zero timeout shows an call reduction from 32818 ns down to 32635 ns, which is a ~0.5% performance improvement. Results based on running 25 tests with turbo disabled (to reduce clock freq turbo changes), with 30 second run per test and comparing the number of poll() calls per second. The % standard deviation of the 25 tests was 0.08%, so results are reliable. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> --- fs/select.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)