mbox series

[0/8] multipath-tools: max_sectors_kb rework

Message ID 20240417184644.6193-1-mwilck@suse.com (mailing list archive)
Headers show
Series multipath-tools: max_sectors_kb rework | expand

Message

Martin Wilck April 17, 2024, 6:46 p.m. UTC
This patch series addresses the issue described in [1].

Changing max_sectors_kb on a live multipath map is dangerous. In particular,
no path device may have a lower value than the map at any time, unless the
device is suspended and all I/O has been flushed. It is also dangerous to
read a max_sectors_kb value from sysfs and write it back, because this may
actually cause the max_sectors value in the kernel (which is in 512 byte block
units, not in kiB) to be rounded down to the next even number.

This patch set avoids both these problematic techniques. max_sectors_kb is
only changed for maps that have the respective configuration parameter set,
and only during map creation or during path addition. In the latter case,
the limit is set on the new path before actually adding it to the map, thus
avoiding a change on a live path. While it is generally recommended to set
max_sectors_kb once and for all before creating a map, the latter technique
can be used to change the value on a live map, by removing and re-adding
a path device.

A side effect of this patch set is that not necessarily all path devices of
a map that has max_sectors_kb set will have this limit configured. In
particular, if max_sectors_kb is enforced by adding a path device, only the
added path and the map itself will have the new max_sectors_kb value, because
multipathd will not attempt to change the value of the other paths, which
can therefore be higher than the configured limit. This is not a problem.
Errors arise only if the limit of the map device is higher than that of
any path device.

This patch set does not attempt to address issues between a multipath device
and block layers stacked on to of it (e.g. LVM or MD raid), as discussed in
the commit message of 8fd4868 ("libmultipath: don't set max_sectors_kb on
reloads"). I think, and my experiments have confirmed it, that for bio-based
stacked devices on top of multipath, max_sectors_kb is not a problem. bios
with a large amount of I/O are split in the kernel to fit into the lower-level
device's limits. Only request-based dm aka multipath is susceptible to errors
caused by max_sectors_kb mismatch. If I am overlooking something here, please
provide an example to prove me wrong.

https://lore.kernel.org/dm-devel/7742003e19b5a49398067dc0c59dfa8ddeffc3d7.camel@suse.com/

Martin Wilck (8):
  multipath-tools: add TGTDIR option
  libmultipath: move get_udev_for_mpp to sysfs.c
  libmultipath: add mp_find_path_by_devt()
  Revert "libmultipath: fix max_sectors_kb on adding path"
  libmultipath: Only set max_sectors_kb on map creation
  libmultipath: set max_sectors_kb in adopt_paths()
  libmultipath: add wildcard %k for printing max_sectors_kb
  multipath.conf(5): update documentation for max_sectors_kb

 .github/actions/spelling/expect.txt |  2 +
 Makefile.inc                        |  8 ++--
 README.md                           | 33 +++++++++++++++
 libmultipath/configure.c            | 63 +++--------------------------
 libmultipath/print.c                | 38 +++++++++++++++++
 libmultipath/structs.c              | 19 +++++++++
 libmultipath/structs.h              |  3 ++
 libmultipath/structs_vec.c          | 62 +++++++++++++++++++++++++---
 libmultipath/structs_vec.h          |  6 ++-
 libmultipath/sysfs.c                | 19 +++++++++
 libmultipath/sysfs.h                |  4 ++
 multipath/multipath.conf.5.in       | 33 +++++++++++++--
 multipathd/main.c                   |  6 +--
 tests/Makefile                      |  2 +-
 tests/test-lib.c                    |  9 ++++-
 15 files changed, 229 insertions(+), 78 deletions(-)

Comments

Benjamin Marzinski April 18, 2024, 6:40 p.m. UTC | #1
On Wed, Apr 17, 2024 at 08:46:36PM +0200, Martin Wilck wrote:
> This patch series addresses the issue described in [1].

For the Set:
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>


> Changing max_sectors_kb on a live multipath map is dangerous. In particular,
> no path device may have a lower value than the map at any time, unless the
> device is suspended and all I/O has been flushed. It is also dangerous to
> read a max_sectors_kb value from sysfs and write it back, because this may
> actually cause the max_sectors value in the kernel (which is in 512 byte block
> units, not in kiB) to be rounded down to the next even number.
> 
> This patch set avoids both these problematic techniques. max_sectors_kb is
> only changed for maps that have the respective configuration parameter set,
> and only during map creation or during path addition. In the latter case,
> the limit is set on the new path before actually adding it to the map, thus
> avoiding a change on a live path. While it is generally recommended to set
> max_sectors_kb once and for all before creating a map, the latter technique
> can be used to change the value on a live map, by removing and re-adding
> a path device.
> 
> A side effect of this patch set is that not necessarily all path devices of
> a map that has max_sectors_kb set will have this limit configured. In
> particular, if max_sectors_kb is enforced by adding a path device, only the
> added path and the map itself will have the new max_sectors_kb value, because
> multipathd will not attempt to change the value of the other paths, which
> can therefore be higher than the configured limit. This is not a problem.
> Errors arise only if the limit of the map device is higher than that of
> any path device.
> 
> This patch set does not attempt to address issues between a multipath device
> and block layers stacked on to of it (e.g. LVM or MD raid), as discussed in
> the commit message of 8fd4868 ("libmultipath: don't set max_sectors_kb on
> reloads"). I think, and my experiments have confirmed it, that for bio-based
> stacked devices on top of multipath, max_sectors_kb is not a problem. bios
> with a large amount of I/O are split in the kernel to fit into the lower-level
> device's limits. Only request-based dm aka multipath is susceptible to errors
> caused by max_sectors_kb mismatch. If I am overlooking something here, please
> provide an example to prove me wrong.
> 
> https://lore.kernel.org/dm-devel/7742003e19b5a49398067dc0c59dfa8ddeffc3d7.camel@suse.com/

I reran the bug reproducer for original issue that inspired 8fd4868,
after manually increasing max_sectors_kb of the device on top of
multipathd, and it works fine on recent kernels.
 
> Martin Wilck (8):
>   multipath-tools: add TGTDIR option
>   libmultipath: move get_udev_for_mpp to sysfs.c
>   libmultipath: add mp_find_path_by_devt()
>   Revert "libmultipath: fix max_sectors_kb on adding path"
>   libmultipath: Only set max_sectors_kb on map creation
>   libmultipath: set max_sectors_kb in adopt_paths()
>   libmultipath: add wildcard %k for printing max_sectors_kb
>   multipath.conf(5): update documentation for max_sectors_kb
> 
>  .github/actions/spelling/expect.txt |  2 +
>  Makefile.inc                        |  8 ++--
>  README.md                           | 33 +++++++++++++++
>  libmultipath/configure.c            | 63 +++--------------------------
>  libmultipath/print.c                | 38 +++++++++++++++++
>  libmultipath/structs.c              | 19 +++++++++
>  libmultipath/structs.h              |  3 ++
>  libmultipath/structs_vec.c          | 62 +++++++++++++++++++++++++---
>  libmultipath/structs_vec.h          |  6 ++-
>  libmultipath/sysfs.c                | 19 +++++++++
>  libmultipath/sysfs.h                |  4 ++
>  multipath/multipath.conf.5.in       | 33 +++++++++++++--
>  multipathd/main.c                   |  6 +--
>  tests/Makefile                      |  2 +-
>  tests/test-lib.c                    |  9 ++++-
>  15 files changed, 229 insertions(+), 78 deletions(-)
> 
> -- 
> 2.44.0