Message ID | 165445911199.1664.12318094116152634589.stgit@bazille.1015granger.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix NFSv3 READDIRPLUS failures | expand |
On Mon, 06 Jun 2022, Chuck Lever wrote: > The xdr_get_next_encode_buffer() function is called infrequently. > On a typical build/test workload, it's called about 1 time in 400 > calls to xdr_reserve_space() (measured on NFSD). > > Force the compiler to remove it from xdr_reserve_space(), which is > a hot path. This change reduces the size of xdr_reserve_space() from > 10 cache lines to 4 on my test system. Does this really help at all? Are the instructions that are executed in the common case distributed over those 10 cache line, or are they all in the first 4? I would have thought the "unlikely" in xdr_reserve_space() would have pushed all the code from xdr_get_next_encode_buffer() to the end of the function leaving the remainder in a small contiguous chunk. NeilBrown > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > net/sunrpc/xdr.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > index b57cf9df4de8..08a85375b311 100644 > --- a/net/sunrpc/xdr.c > +++ b/net/sunrpc/xdr.c > @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr) > } > EXPORT_SYMBOL_GPL(xdr_commit_encode); > > -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, > - size_t nbytes) > +/* > + * The buffer space to be reserved crosses the boundary between > + * xdr->buf->head and xdr->buf->pages, or between two pages > + * in xdr->buf->pages. > + */ > +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, > + size_t nbytes) > { > __be32 *p; > int space_left; > > >
> On Jun 6, 2022, at 9:03 PM, NeilBrown <neilb@suse.de> wrote: > > On Mon, 06 Jun 2022, Chuck Lever wrote: >> The xdr_get_next_encode_buffer() function is called infrequently. >> On a typical build/test workload, it's called about 1 time in 400 >> calls to xdr_reserve_space() (measured on NFSD). >> >> Force the compiler to remove it from xdr_reserve_space(), which is >> a hot path. This change reduces the size of xdr_reserve_space() from >> 10 cache lines to 4 on my test system. > > Does this really help at all? Are the instructions that are executed in > the common case distributed over those 10 cache line, or are they all in > the first 4? > > I would have thought the "unlikely" in xdr_reserve_space() would have > pushed all the code from xdr_get_next_encode_buffer() to the end of the > function leaving the remainder in a small contiguous chunk. Well, granted that I'm compiling with -Os, not -O2. The compiler inlines xdr_get_next_encode_buffer() right in the middle of xdr_reserve_space(). > NeilBrown > > >> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >> --- >> net/sunrpc/xdr.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c >> index b57cf9df4de8..08a85375b311 100644 >> --- a/net/sunrpc/xdr.c >> +++ b/net/sunrpc/xdr.c >> @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr) >> } >> EXPORT_SYMBOL_GPL(xdr_commit_encode); >> >> -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, >> - size_t nbytes) >> +/* >> + * The buffer space to be reserved crosses the boundary between >> + * xdr->buf->head and xdr->buf->pages, or between two pages >> + * in xdr->buf->pages. >> + */ >> +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, >> + size_t nbytes) >> { >> __be32 *p; >> int space_left; >> >> >> -- Chuck Lever
On Tue, 07 Jun 2022, Chuck Lever III wrote: > > > On Jun 6, 2022, at 9:03 PM, NeilBrown <neilb@suse.de> wrote: > > > > On Mon, 06 Jun 2022, Chuck Lever wrote: > >> The xdr_get_next_encode_buffer() function is called infrequently. > >> On a typical build/test workload, it's called about 1 time in 400 > >> calls to xdr_reserve_space() (measured on NFSD). > >> > >> Force the compiler to remove it from xdr_reserve_space(), which is > >> a hot path. This change reduces the size of xdr_reserve_space() from > >> 10 cache lines to 4 on my test system. > > > > Does this really help at all? Are the instructions that are executed in > > the common case distributed over those 10 cache line, or are they all in > > the first 4? > > > > I would have thought the "unlikely" in xdr_reserve_space() would have > > pushed all the code from xdr_get_next_encode_buffer() to the end of the > > function leaving the remainder in a small contiguous chunk. > > Well, granted that I'm compiling with -Os, not -O2. The compiler inlines > xdr_get_next_encode_buffer() right in the middle of xdr_reserve_space(). Interesting. I tried with -O2 and it move xdr_get_next_encode_buffer() to the end, but inlined xdr_commit_encode() in the middle. I changed xdr_commit_encode() to wrap the "shift==0" test in likely(), and then it produced a more reasonable result. With your noinline patch, the "return xdr_get_next_encode_buffer()" becomes a jump, not a jump-to-subroutine, so there is little cost in it. Might I suggest: Move the "xdr->scratch.iov_len" test out of xdr_commit_encode() and put it in both callers as "unlikely". Mark both xdr_commit_encode and xdr_get_next_encode_buffer() as noinline mention in the commit message that with -Os the "unlikely" in xdr_reserve_space() doesn't help ?? Thanks, NeilBrown > > > > NeilBrown > > > > > >> > >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > >> --- > >> net/sunrpc/xdr.c | 9 +++++++-- > >> 1 file changed, 7 insertions(+), 2 deletions(-) > >> > >> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > >> index b57cf9df4de8..08a85375b311 100644 > >> --- a/net/sunrpc/xdr.c > >> +++ b/net/sunrpc/xdr.c > >> @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr) > >> } > >> EXPORT_SYMBOL_GPL(xdr_commit_encode); > >> > >> -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, > >> - size_t nbytes) > >> +/* > >> + * The buffer space to be reserved crosses the boundary between > >> + * xdr->buf->head and xdr->buf->pages, or between two pages > >> + * in xdr->buf->pages. > >> + */ > >> +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, > >> + size_t nbytes) > >> { > >> __be32 *p; > >> int space_left; > >> > >> > >> > > -- > Chuck Lever > > > >
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index b57cf9df4de8..08a85375b311 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -945,8 +945,13 @@ inline void xdr_commit_encode(struct xdr_stream *xdr) } EXPORT_SYMBOL_GPL(xdr_commit_encode); -static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, - size_t nbytes) +/* + * The buffer space to be reserved crosses the boundary between + * xdr->buf->head and xdr->buf->pages, or between two pages + * in xdr->buf->pages. + */ +static noinline __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr, + size_t nbytes) { __be32 *p; int space_left;
The xdr_get_next_encode_buffer() function is called infrequently. On a typical build/test workload, it's called about 1 time in 400 calls to xdr_reserve_space() (measured on NFSD). Force the compiler to remove it from xdr_reserve_space(), which is a hot path. This change reduces the size of xdr_reserve_space() from 10 cache lines to 4 on my test system. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> --- net/sunrpc/xdr.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)