[RFC,net-next,0/3] txhash: Make hash rethink configurable and change the default

Message ID 20210809185314.38187-1-tom@herbertland.com (mailing list archive)

Message

Tom Herbert Aug. 9, 2021, 6:53 p.m. UTC
Alexander Azimov performed some nice analysis of the feature in the
Linux stack where the IPv6 flow label is changed when the stack detects
that a connection is failing. The idea of the algorithm is to try to
find a better path. His results are quite impressive and show that this
form of source routing can work effectively.

Alex raised an issue: if the server endpoint is an IP anycast address,
the connection might break if the flow label changes the routing of
packets on the connection. Anycast is known to be susceptible to route
changes, not just those caused by the flow label. The concern is that
flow label modulation might increase the chances that anycast
connections break, especially if the rethink occurs after just one
RTO, which is the current behavior.

This patch set makes the rethink behavior granular and configurable.
It allows control of when to do the hash rethink: upon negative advice,
at RTO in SYN state, at RTO when not in SYN state. The behavior can
be configured by sysctl and by a socket option.
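
To make the three trigger points concrete, here is a minimal sketch of
how such a mode could be represented as a bitmask and checked. All
identifiers below are hypothetical and are not taken from the patches,
and the rule that a nonzero per-socket value overrides the sysctl value
is an assumption made only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; the identifiers in the actual patches may differ. */
enum rethink_mode {
	RETHINK_NEG_ADVICE = 1 << 0, /* negative advice was reported */
	RETHINK_SYN_RTO    = 1 << 1, /* RTO fired while in SYN state */
	RETHINK_RTO        = 1 << 2, /* RTO fired when not in SYN state */
};

/* Default proposed by the cover letter: rethink on negative advice only. */
#define RETHINK_DEFAULT RETHINK_NEG_ADVICE

/* Assumed stand-in for the more aggressive behavior: every trigger enabled. */
#define RETHINK_LEGACY (RETHINK_NEG_ADVICE | RETHINK_SYN_RTO | RETHINK_RTO)

/*
 * Assumption: a per-socket value of 0 means "not set", so the
 * sysctl value applies; any nonzero per-socket value wins.
 */
static inline uint8_t effective_mode(uint8_t sysctl_mode, uint8_t sock_mode)
{
	return sock_mode ? sock_mode : sysctl_mode;
}

/* Return true if the given event should trigger a hash rethink. */
static inline bool should_rethink(uint8_t mode, enum rethink_mode event)
{
	return (mode & event) != 0;
}
```

With RETHINK_DEFAULT, should_rethink() is true only for
RETHINK_NEG_ADVICE. A mask that also contains RETHINK_RTO models a
system that keeps rehashing at RTO.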

This patch set also changes the default rethink behavior to do a
rethink only on negative advice. This reverts to the original behavior
of the hash rethink mechanism. It is less aggressive, with the intent of
mitigating potential breakages when anycast addresses are present.
Users that benefit from changing the hash at the first RTO can retain
that behavior by setting the sysctl.

Tom Herbert (3):
  txhash: Make rethinking txhash behavior configurable via sysctl
  txhash: Add socket option to control TX hash rethink behavior
  txhash: Change default rethink behavior to be less aggressive

 arch/alpha/include/uapi/asm/socket.h  |  2 ++
 arch/mips/include/uapi/asm/socket.h   |  2 ++
 arch/parisc/include/uapi/asm/socket.h |  2 ++
 arch/sparc/include/uapi/asm/socket.h  |  3 ++-
 include/net/netns/core.h              |  2 ++
 include/net/sock.h                    | 32 +++++++++++++++++++--------
 include/uapi/asm-generic/socket.h     |  2 ++
 include/uapi/linux/socket.h           | 13 +++++++++++
 net/core/net_namespace.c              |  4 ++++
 net/core/sock.c                       | 16 ++++++++++++++
 net/core/sysctl_net_core.c            |  7 ++++++
 net/ipv4/tcp_input.c                  |  2 +-
 net/ipv4/tcp_timer.c                  |  5 ++++-
 13 files changed, 80 insertions(+), 12 deletions(-)

Comments

Yuchung Cheng Aug. 9, 2021, 9:56 p.m. UTC | #1
On Mon, Aug 9, 2021 at 11:53 AM Tom Herbert <tom@herbertland.com> wrote:
>
> Alexander Azimov performed some nice analysis of the feature in the
> Linux stack where the IPv6 flow label is changed when the stack detects
> that a connection is failing. The idea of the algorithm is to try to
> find a better path. His results are quite impressive and show that this
> form of source routing can work effectively.
>
> Alex raised an issue: if the server endpoint is an IP anycast address,
> the connection might break if the flow label changes the routing of
> packets on the connection. Anycast is known to be susceptible to route
> changes, not just those caused by the flow label. The concern is that
> flow label modulation might increase the chances that anycast
> connections break, especially if the rethink occurs after just one
> RTO, which is the current behavior.
>
> This patch set makes the rethink behavior granular and configurable.
> It allows control of when to do the hash rethink: upon negative advice,
> at RTO in SYN state, at RTO when not in SYN state. The behavior can
> be configured by sysctl and by a socket option.
>
> This patch set also changes the default rethink behavior to do a
> rethink only on negative advice. This reverts to the original behavior
> of the hash rethink mechanism. It is less aggressive, with the intent of
Thanks for offering knobs to the txhash mechanism.

Any reason why reverting the default behavior (that was changed in
2013) is necessary? Systems that now rely on this RTO tx-rehash to work
around link failures will have to manually re-enable it. Some users may
have to learn from a higher rate of connection failures before they
eventually identify this kernel change.

> mitigating potential breakages when anycast addresses are present.
> Users that benefit from changing the hash at the first RTO can retain
> that behavior by setting the sysctl.
Yuchung Cheng Aug. 9, 2021, 9:58 p.m. UTC | #2
On Mon, Aug 9, 2021 at 2:56 PM Yuchung Cheng <ycheng@google.com> wrote:
>
> On Mon, Aug 9, 2021 at 11:53 AM Tom Herbert <tom@herbertland.com> wrote:
> >
> > Alexander Azimov performed some nice analysis of the feature in the
> > Linux stack where the IPv6 flow label is changed when the stack detects
> > that a connection is failing. The idea of the algorithm is to try to
> > find a better path. His results are quite impressive and show that this
> > form of source routing can work effectively.
> >
> > Alex raised an issue: if the server endpoint is an IP anycast address,
> > the connection might break if the flow label changes the routing of
> > packets on the connection. Anycast is known to be susceptible to route
> > changes, not just those caused by the flow label. The concern is that
> > flow label modulation might increase the chances that anycast
> > connections break, especially if the rethink occurs after just one
> > RTO, which is the current behavior.
> >
> > This patch set makes the rethink behavior granular and configurable.
> > It allows control of when to do the hash rethink: upon negative advice,
> > at RTO in SYN state, at RTO when not in SYN state. The behavior can
> > be configured by sysctl and by a socket option.
> >
> > This patch set also changes the default rethink behavior to do a
> > rethink only on negative advice. This reverts to the original behavior
> > of the hash rethink mechanism. It is less aggressive, with the intent of
> Thanks for offering knobs to the txhash mechanism.
>
> Any reason why reverting the default behavior (that was changed in
> 2013) is necessary? Systems that now rely on this RTO tx-rehash to work
> around link failures will have to manually re-enable it. Some users may
> have to learn from a higher rate of connection failures before they
> eventually identify this kernel change.
Just to be clear: I agree we should offer knobs to change the txhash
behavior, so the first parts of this set look good to me. I am only
concerned about the default behavior reversal.

>
> > mitigating potential breakages when anycast addresses are present.
> > Users that benefit from changing the hash at the first RTO can retain
> > that behavior by setting the sysctl.