mbox series

[v5,00/23] path checker refactor and misc fixes

Message ID 20241015032835.2693247-1-bmarzins@redhat.com (mailing list archive)
Headers show
Series path checker refactor and misc fixes | expand

Message

Benjamin Marzinski Oct. 15, 2024, 3:28 a.m. UTC
This patchset is based on discussions I had with Martin Wilck about my
last, partially applied patchset.

The first two patches are reposts of patches from my earlier patchset,
redone to work with the new libmp_mapinfo() API. They make it possible
to add maps by WWID with "multipathd add map".

The rest of the patches are work and bugfixes related to refactoring
checkerloop so that

1. multipath devices get resynced with the kernel occasionally, even if
they have not paths

2. If multiple paths from a multipath device are checked in the same
loop, the multipath device will only get resynced once.

3. All the paths of a multipath device will converge to being checked at
the same time (at least as much as their differing checkint values will
allow).

4. The times for checking the paths of each multipath device will spread
out as much as possible so multipathd doesn't do all of it's checking in
a burst.

5. path checking is done my multipath device (for initialized paths,
the uninitialized paths are handled after all the adopted paths are
handled).

Items 1 & 2 are handled by patch 10 and preceding patches.

Items 3 & 4 and handled by patch 17 and preceding patches.

Item 5 is handled by patch 18 and 19.

Changes in v5 (all from Martin's suggestions):
19: Changed checker variable name from ct->waited_for to
    ct->checked_state
23: New patch to change the directio checker to use timestamps to
    check for timeout.

Changes in v4 (all from Martin's suggestions):
09: Changed commit message and fixed incorrect comment
10: Moved getting checkint from config to the if clause where its used
11 & 12: Moved setting pp->checkint from 12 to 11
14: Added comment about when refresh_all is set.

Changes in v3:
0: adapt to 27053732 'fixup "libmultipath: use libmp_pathinfo() in
    update_multipath_table()"'. This is identical to 1d564df4
    ("multipathd: adjust when mpp is synced with the kernel") on
    mwilck/tip.
11: fix commit message (Martin)
17: use symbolic return code (Martin)

Changes in v2 (old patch number in quotes):
01: change from returning minor number to dm info (Martin)
02: adapt to change in patch 01
05 (05-07): squash into one commit (Martin)
08 (10): fix commit message (Matin)
10 (12): rename do_check_mpp amd check_mpp
11 (13): move the code that saved the path name to right before it was
         removed (Martin)
14 (16): use symbolic return codes (Martin)
15 (17): set path state to PATH_UNCHECKED when orphaned, and skip paths
         in INIT_REMOVED in sync_map_state() (Martin)
17 (19): clear pp->pending_ticks in check_path() if the path is delayed
         instead of pending. Use a different equation to check if we
         need to modify the ticks, that can adjust the ticks on every
         check. adapt to change in patch 14
18 (20): adapt to change in patch 14
19 (21): adapt to change in patch 10 and 14
20 (22): Add fixes trailer

Benjamin Marzinski (23):
  libmultipath: store checker_check() result in checker struct
  libmultipath: add missing checker function prototypes
  libmultipath: split out the code to wait for pending checkers
  libmultipath: remove pending wait code from libcheck_check calls
  multipath-tools tests: fix up directio tests
  libmultipath: split get_state into two functions
  libmultipath: change path_offline to path_sysfs_state
  multipathd: split check_path_state into two functions
  multipathd: rename do_check_path to update_path_state
  multipathd: split check_path into two functions
  multipathd: split handle_uninitialized_path into two functions
  multipathd: split check_paths into two functions
  multipathd: fix "fail path" and "reinstate path" commands
  multipathd: update priority once after updating all paths
  multipathd: simplify checking for followover_should_failback
  multipathd: only refresh prios on PATH_UP and PATH_GHOST
  multipathd: remove pointless check
  multipathd: fix deferred_failback_tick for reload removes
  libmultipath: add libcheck_need_wait checker function
  libmultipath: don't wait in libcheck_pending
  multipathd: wait for checkers to complete
  multipath-tools tests: fix up directio tests
  multipathd: use timestamps to tell when the directio checker timed out

 libmultipath/checkers.c           |  47 +++-
 libmultipath/checkers.h           |  10 +-
 libmultipath/checkers/directio.c  | 177 +++++++++----
 libmultipath/checkers/tur.c       |  79 +++---
 libmultipath/discovery.c          |  97 ++++---
 libmultipath/discovery.h          |   6 +-
 libmultipath/libmultipath.version |   4 +-
 libmultipath/print.c              |   2 +-
 libmultipath/propsel.c            |   1 +
 libmultipath/structs.h            |  21 +-
 multipathd/cli_handlers.c         |   1 +
 multipathd/main.c                 | 409 ++++++++++++++++++------------
 tests/directio.c                  | 182 +++++++++----
 13 files changed, 676 insertions(+), 360 deletions(-)

Comments

Martin Wilck Nov. 3, 2024, 9:07 p.m. UTC | #1
On Mon, 2024-10-14 at 23:28 -0400, Benjamin Marzinski wrote:
> This patchset is based on discussions I had with Martin Wilck about
> my
> last, partially applied patchset.
> 
> The first two patches are reposts of patches from my earlier
> patchset,
> redone to work with the new libmp_mapinfo() API. They make it
> possible
> to add maps by WWID with "multipathd add map".
> 
> The rest of the patches are work and bugfixes related to refactoring
> checkerloop so that
> 
> 1. multipath devices get resynced with the kernel occasionally, even
> if
> they have not paths
> 
> 2. If multiple paths from a multipath device are checked in the same
> loop, the multipath device will only get resynced once.
> 
> 3. All the paths of a multipath device will converge to being checked
> at
> the same time (at least as much as their differing checkint values
> will
> allow).
> 
> 4. The times for checking the paths of each multipath device will
> spread
> out as much as possible so multipathd doesn't do all of it's checking
> in
> a burst.
> 
> 5. path checking is done my multipath device (for initialized paths,
> the uninitialized paths are handled after all the adopted paths are
> handled).
> 
> Items 1 & 2 are handled by patch 10 and preceding patches.
> 
> Items 3 & 4 and handled by patch 17 and preceding patches.
> 
> Item 5 is handled by patch 18 and 19.
> 
> Changes in v5 (all from Martin's suggestions):
> 19: Changed checker variable name from ct->waited_for to
>     ct->checked_state
> 23: New patch to change the directio checker to use timestamps to
>     check for timeout.
> 
> Changes in v4 (all from Martin's suggestions):
> 09: Changed commit message and fixed incorrect comment
> 10: Moved getting checkint from config to the if clause where its
> used
> 11 & 12: Moved setting pp->checkint from 12 to 11
> 14: Added comment about when refresh_all is set.
> 
> Changes in v3:
> 0: adapt to 27053732 'fixup "libmultipath: use libmp_pathinfo() in
>     update_multipath_table()"'. This is identical to 1d564df4
>     ("multipathd: adjust when mpp is synced with the kernel") on
>     mwilck/tip.
> 11: fix commit message (Martin)
> 17: use symbolic return code (Martin)
> 
> Changes in v2 (old patch number in quotes):
> 01: change from returning minor number to dm info (Martin)
> 02: adapt to change in patch 01
> 05 (05-07): squash into one commit (Martin)
> 08 (10): fix commit message (Matin)
> 10 (12): rename do_check_mpp amd check_mpp
> 11 (13): move the code that saved the path name to right before it
> was
>          removed (Martin)
> 14 (16): use symbolic return codes (Martin)
> 15 (17): set path state to PATH_UNCHECKED when orphaned, and skip
> paths
>          in INIT_REMOVED in sync_map_state() (Martin)
> 17 (19): clear pp->pending_ticks in check_path() if the path is
> delayed
>          instead of pending. Use a different equation to check if we
>          need to modify the ticks, that can adjust the ticks on every
>          check. adapt to change in patch 14
> 18 (20): adapt to change in patch 14
> 19 (21): adapt to change in patch 10 and 14
> 20 (22): Add fixes trailer
> 
> Benjamin Marzinski (23):
>   libmultipath: store checker_check() result in checker struct
>   libmultipath: add missing checker function prototypes
>   libmultipath: split out the code to wait for pending checkers
>   libmultipath: remove pending wait code from libcheck_check calls
>   multipath-tools tests: fix up directio tests
>   libmultipath: split get_state into two functions
>   libmultipath: change path_offline to path_sysfs_state
>   multipathd: split check_path_state into two functions
>   multipathd: rename do_check_path to update_path_state
>   multipathd: split check_path into two functions
>   multipathd: split handle_uninitialized_path into two functions
>   multipathd: split check_paths into two functions
>   multipathd: fix "fail path" and "reinstate path" commands
>   multipathd: update priority once after updating all paths
>   multipathd: simplify checking for followover_should_failback
>   multipathd: only refresh prios on PATH_UP and PATH_GHOST
>   multipathd: remove pointless check
>   multipathd: fix deferred_failback_tick for reload removes
>   libmultipath: add libcheck_need_wait checker function
>   libmultipath: don't wait in libcheck_pending
>   multipathd: wait for checkers to complete
>   multipath-tools tests: fix up directio tests
>   multipathd: use timestamps to tell when the directio checker timed
> out
> 
>  libmultipath/checkers.c           |  47 +++-
>  libmultipath/checkers.h           |  10 +-
>  libmultipath/checkers/directio.c  | 177 +++++++++----
>  libmultipath/checkers/tur.c       |  79 +++---
>  libmultipath/discovery.c          |  97 ++++---
>  libmultipath/discovery.h          |   6 +-
>  libmultipath/libmultipath.version |   4 +-
>  libmultipath/print.c              |   2 +-
>  libmultipath/propsel.c            |   1 +
>  libmultipath/structs.h            |  21 +-
>  multipathd/cli_handlers.c         |   1 +
>  multipathd/main.c                 | 409 ++++++++++++++++++----------
> --
>  tests/directio.c                  | 182 +++++++++----
>  13 files changed, 676 insertions(+), 360 deletions(-)
> 

Sorry again for the long delay. I've applied this series, except patch
23/23, to my "queue" branch.

Regards
Martin