[v2,0/4] dm pr_ops fixes

Message ID 20220717224508.10404-1-michael.christie@oracle.com (mailing list archive)

Message

Mike Christie July 17, 2022, 10:45 p.m. UTC
The following patches were made over Linus's tree and fix a couple bugs
in the pr_ops code when a reservation type other than one of the All
Registrants types is used. They were tested with the Windows failover
cluster verification tests and libiscsi's PGR tests.

The current dm pr_ops code works well for All Registrants because any
registered path is the reservation holder. Commands like reserve and
release can go down any path and the behavior is the same. The problem
these patches fix occurs when only one path is the holder, as is the case
for the other reservation types, which are used by Windows Failover Cluster
and Linux Cluster (tools like pacemaker + scsi/multipath_fence agents).
For example, for Registrants Only, the path that got the RESERVE command is
the reservation holder. The RELEASE must be sent down that path to release
the reservation.

With our current design we send non-registration PR commands down
whatever path we are currently using, and then later PR commands end
up on different paths. To continue the current design where dm's pr_ops
are just passing through requests, and to avoid adding PR state to dm,
these patches modify pr_reserve/release to work similarly to pr_register,
where we loop over all paths, or at least loop over all paths until we
find the path we are looking for.
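
As a rough illustration of the looping idea described above, here is a small
userspace simulation. It is not the dm code from these patches: struct
sim_path, sim_reserve(), sim_release(), reserve_any_path() and
release_all_paths() are hypothetical names, and the simulated target
behavior (a RELEASE on a registered non-holder path returns success without
releasing anything) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one path to the same logical unit. */
struct sim_path {
	bool registered;	/* a key was registered through this path */
	bool holder;		/* this path holds the reservation */
};

enum { SIM_OK = 0, SIM_CONFLICT = -1 };

/* Assumed target behavior for RESERVE with a single-holder type. */
static int sim_reserve(struct sim_path *paths, size_t n, size_t i)
{
	if (!paths[i].registered)
		return SIM_CONFLICT;
	for (size_t j = 0; j < n; j++)
		if (j != i && paths[j].holder)
			return SIM_CONFLICT;
	paths[i].holder = true;
	return SIM_OK;
}

/* Assumed target behavior for RELEASE: a no-op success on a non-holder. */
static int sim_release(struct sim_path *paths, size_t i)
{
	if (!paths[i].registered)
		return SIM_CONFLICT;
	paths[i].holder = false;
	return SIM_OK;
}

/* Try each path in turn and stop at the first one that accepts RESERVE. */
static int reserve_any_path(struct sim_path *paths, size_t n)
{
	int ret = SIM_CONFLICT;

	assert(paths != NULL);
	for (size_t i = 0; i < n; i++) {
		ret = sim_reserve(paths, n, i);
		if (ret == SIM_OK)
			break;
	}
	return ret;
}

/* Send RELEASE down every path so that it also reaches the holder. */
static int release_all_paths(struct sim_path *paths, size_t n)
{
	int ret = SIM_OK;

	assert(paths != NULL);
	for (size_t i = 0; i < n; i++) {
		int r = sim_release(paths, i);

		if (r != SIM_OK && ret == SIM_OK)
			ret = r;	/* remember the first failure, keep going */
	}
	return ret;
}

/* True if any path still holds the reservation. */
static bool any_holder(const struct sim_path *paths, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (paths[i].holder)
			return true;
	return false;
}
```

In this model, a RELEASE sent down a single non-holder path reports success
but leaves the reservation in place, while release_all_paths() clears it
without dm having to remember which path is the holder.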

v2:
- Added info about testing.
- Added patch for pr_preempt.



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

Comments

Mike Snitzer July 22, 2022, 7:22 p.m. UTC | #1
On Sun, Jul 17 2022 at  6:45P -0400,
Mike Christie <michael.christie@oracle.com> wrote:

> The following patches were made over Linus's tree and fix a couple bugs
> in the pr_ops code when a reservation type other than one of the All
> Registrants types is used. They were tested with the Windows failover
> cluster verification tests and libiscsi's PGR tests.
> 
> The current dm pr_ops code works well for All Registrants because any
> registered path is the reservation holder. Commands like reserve and
> release can go down any path and the behavior is the same. The problem
> these patches fix occurs when only one path is the holder, as is the case
> for the other reservation types, which are used by Windows Failover Cluster
> and Linux Cluster (tools like pacemaker + scsi/multipath_fence agents).
> For example, for Registrants Only, the path that got the RESERVE command is
> the reservation holder. The RELEASE must be sent down that path to release
> the reservation.
> 
> With our current design we send non-registration PR commands down
> whatever path we are currently using, and then later PR commands end
> up on different paths. To continue the current design where dm's pr_ops
> are just passing through requests, and to avoid adding PR state to dm,
> these patches modify pr_reserve/release to work similarly to pr_register,
> where we loop over all paths, or at least loop over all paths until we
> find the path we are looking for.
> 
> v2:
> - Added info about testing.
> - Added patch for pr_preempt.

I picked this set up for 5.20 and staged it in linux-next.

I tweaked the patch headers a bit while proof-reading and
understanding the scope of the changes.

I noticed that dm_pr_clear is the only remaining dm_pr_* method that
is using dm_{prepare,unprepare}_ioctl. I assume that'll be fine, but
the one gap it leaves is handling for the possibility that the DM
device is suspended.  Shouldn't dm_call_pr() be enhanced to check:
if (dm_suspended_md(md)) ?
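
To make the question concrete, here is a userspace sketch of that kind of
guard. It is not the real dm_call_pr(): struct sim_md, sim_call_pr() and
sim_pr_op() are hypothetical stand-ins, and returning -EAGAIN for a
suspended device is an assumption about what the check would do.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mapped device state. */
struct sim_md {
	bool suspended;		/* models what dm_suspended_md(md) would report */
	int calls;		/* counts PR operations that actually ran */
};

typedef int (*sim_pr_fn)(struct sim_md *md);

/* Refuse to run the PR operation while the device is suspended. */
static int sim_call_pr(struct sim_md *md, sim_pr_fn fn)
{
	assert(md != NULL && fn != NULL);
	if (md->suspended)
		return -EAGAIN;	/* assumed error code for this sketch */
	return fn(md);
}

/* Trivial PR operation that only records that it was invoked. */
static int sim_pr_op(struct sim_md *md)
{
	md->calls++;
	return 0;
}
```

With the guard in place, a call made while suspended fails early and the
PR operation is never issued; once the device is resumed, the same call
goes through.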

Mike
