diff mbox series

[05/21] libmultipath: lookup_binding: add comment about the algorithm

Message ID 20230901180235.23980-6-mwilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show
Series multipath-tools: user-friendly names rework | expand

Commit Message

Martin Wilck Sept. 1, 2023, 6:02 p.m. UTC
From: Martin Wilck <mwilck@suse.com>

When I read this code, I always get confused. Adding comments to
explain the algorithm.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/alias.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Comments

Benjamin Marzinski Sept. 6, 2023, 10:43 p.m. UTC | #1
On Fri, Sep 01, 2023 at 08:02:18PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> When I read this code, I always get confused. Adding comments to
> explain the algorithm.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/alias.c | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/libmultipath/alias.c b/libmultipath/alias.c
> index f7834d1..e61eb91 100644
> --- a/libmultipath/alias.c
> +++ b/libmultipath/alias.c
> @@ -172,6 +172,41 @@ lookup_binding(FILE *f, const char *map_wwid, char **map_alias,
>  		alias = strtok_r(buf, " \t", &saveptr);
>  		if (!alias) /* blank line */
>  			continue;
> +
> +		/*
> +		 * Find an unused index - explanation of the algorithm
> +		 *
> +		 * ID: 1 = mpatha, 2 = mpathb, ...
> +		 *
> +		 * We assume the bindings are unsorted. The only constraint
> +		 * is that no ID occurs more than once. IDs that occur in the
> +		 * bindings are called "used".
> +		 *
> +		 * We call the list 1,2,3,..., exactly in this order, the list
> +		 * of "expected" IDs. The variable "id" always holds the next
> +		 * "expected" ID, IOW the last "expected" ID encountered plus 1.
> +		 * Thus all IDs below "id" are known to be used. However, at the
> +		 * end of the loop, the value of "id" isn't necessarily unused.
> +		 *
> +		 * "smallest_bigger_id" is the smallest used ID that was
> +		 * encountered while it was larger than the next "expected" ID
> +		 * at that iteration. Let X be some used ID. If all IDs below X
> +		 * are used and encountered in the right sequence before X, "id"
> +		 * will be > X when the loop ends. Otherwise, X was encountered
> +		 * "out of order", the condition (X > id) holds when X is
> +		 * encountered, and "smallest_bigger_id" will be set to X; i.e.
> +		 * it will be less or equal than X when the loop ends.
> +		 *
> +		 * At the end of the loop, (id < smallest_bigger_id) means that
> +		 * the value of "id" had been encountered neither in order nor
> +		 * out of order, and is thus unused. (id >= smallest_bigger_id)

I know the check is (id >= smallest_bigger_id), but as long as no ID
occurs more than once, id can never actually be bigger than
smallest_bigger_id since id only gets incremented when (curr_id == id)
and if smallest_bigger_id is not INT_MAX, then smallest_bigger_id
already occured once in the file before id was incremented to equal it.
This means it can't occur again, so id can never get incremented past
it. Not this this really matters, so

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>

> +		 * means that "id"'s value is in use. In this case, we play safe
> +		 * and use "biggest_id + 1" as the next value to try.
> +		 *
> +		 * biggest_id is always > smallest_bigger_id, except in the
> +		 * "perfectly ordered" case.
> +		 */
> +
>  		curr_id = scan_devname(alias, prefix);
>  		if (curr_id == id) {
>  			if (id < INT_MAX)
> -- 
> 2.41.0
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Martin Wilck Sept. 7, 2023, 8:22 a.m. UTC | #2
On Wed, 2023-09-06 at 17:43 -0500, Benjamin Marzinski wrote:
> On Fri, Sep 01, 2023 at 08:02:18PM +0200, mwilck@suse.com wrote:
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > When I read this code, I always get confused. Adding comments to
> > explain the algorithm.
> > 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  libmultipath/alias.c | 35 +++++++++++++++++++++++++++++++++++
> >  1 file changed, 35 insertions(+)
> > 
> > diff --git a/libmultipath/alias.c b/libmultipath/alias.c
> > index f7834d1..e61eb91 100644
> > --- a/libmultipath/alias.c
> > +++ b/libmultipath/alias.c
> > @@ -172,6 +172,41 @@ lookup_binding(FILE *f, const char *map_wwid,
> > char **map_alias,
> >                 alias = strtok_r(buf, " \t", &saveptr);
> >                 if (!alias) /* blank line */
> >                         continue;
> > +
> > +               /*
> > +                * Find an unused index - explanation of the
> > algorithm
> > +                *
> > +                * ID: 1 = mpatha, 2 = mpathb, ...
> > +                *
> > +                * We assume the bindings are unsorted. The only
> > constraint
> > +                * is that no ID occurs more than once. IDs that
> > occur in the
> > +                * bindings are called "used".
> > +                *
> > +                * We call the list 1,2,3,..., exactly in this
> > order, the list
> > +                * of "expected" IDs. The variable "id" always
> > holds the next
> > +                * "expected" ID, IOW the last "expected" ID
> > encountered plus 1.
> > +                * Thus all IDs below "id" are known to be used.
> > However, at the
> > +                * end of the loop, the value of "id" isn't
> > necessarily unused.
> > +                *
> > +                * "smallest_bigger_id" is the smallest used ID
> > that was
> > +                * encountered while it was larger than the next
> > "expected" ID
> > +                * at that iteration. Let X be some used ID. If all
> > IDs below X
> > +                * are used and encountered in the right sequence
> > before X, "id"
> > +                * will be > X when the loop ends. Otherwise, X was
> > encountered
> > +                * "out of order", the condition (X > id) holds
> > when X is
> > +                * encountered, and "smallest_bigger_id" will be
> > set to X; i.e.
> > +                * it will be less or equal than X when the loop
> > ends.
> > +                *
> > +                * At the end of the loop, (id <
> > smallest_bigger_id) means that
> > +                * the value of "id" had been encountered neither
> > in order nor
> > +                * out of order, and is thus unused. (id >=
> > smallest_bigger_id)
> 
> I know the check is (id >= smallest_bigger_id), but as long as no ID
> occurs more than once, id can never actually be bigger than
> smallest_bigger_id since id only gets incremented when (curr_id ==
> id)
> and if smallest_bigger_id is not INT_MAX, then smallest_bigger_id
> already occured once in the file before id was incremented to equal
> it.
> This means it can't occur again, so id can never get incremented past
> it. Not this this really matters, so

Right. Like you, I thought this didn't matter for the algorithm as
such, and that explaining it would make the lengthy explanation even
lengthier.

Btw, if you have a simpler explanation of this algorithm, please
provide it ;-)

Martin

> 
> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> 
> > +                * means that "id"'s value is in use. In this case,
> > we play safe
> > +                * and use "biggest_id + 1" as the next value to
> > try.
> > +                *
> > +                * biggest_id is always > smallest_bigger_id,
> > except in the
> > +                * "perfectly ordered" case.
> > +                */
> > +
> >                 curr_id = scan_devname(alias, prefix);
> >                 if (curr_id == id) {
> >                         if (id < INT_MAX)
> > -- 
> > 2.41.0
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
diff mbox series

Patch

diff --git a/libmultipath/alias.c b/libmultipath/alias.c
index f7834d1..e61eb91 100644
--- a/libmultipath/alias.c
+++ b/libmultipath/alias.c
@@ -172,6 +172,41 @@  lookup_binding(FILE *f, const char *map_wwid, char **map_alias,
 		alias = strtok_r(buf, " \t", &saveptr);
 		if (!alias) /* blank line */
 			continue;
+
+		/*
+		 * Find an unused index - explanation of the algorithm
+		 *
+		 * ID: 1 = mpatha, 2 = mpathb, ...
+		 *
+		 * We assume the bindings are unsorted. The only constraint
+		 * is that no ID occurs more than once. IDs that occur in the
+		 * bindings are called "used".
+		 *
+		 * We call the list 1,2,3,..., exactly in this order, the list
+		 * of "expected" IDs. The variable "id" always holds the next
+		 * "expected" ID, IOW the last "expected" ID encountered plus 1.
+		 * Thus all IDs below "id" are known to be used. However, at the
+		 * end of the loop, the value of "id" isn't necessarily unused.
+		 *
+		 * "smallest_bigger_id" is the smallest used ID that was
+		 * encountered while it was larger than the next "expected" ID
+		 * at that iteration. Let X be some used ID. If all IDs below X
+		 * are used and encountered in the right sequence before X, "id"
+		 * will be > X when the loop ends. Otherwise, X was encountered
+		 * "out of order", the condition (X > id) holds when X is
+		 * encountered, and "smallest_bigger_id" will be set to X; i.e.
+		 * it will be less or equal than X when the loop ends.
+		 *
+		 * At the end of the loop, (id < smallest_bigger_id) means that
+		 * the value of "id" had been encountered neither in order nor
+		 * out of order, and is thus unused. (id >= smallest_bigger_id)
+		 * means that "id"'s value is in use. In this case, we play safe
+		 * and use "biggest_id + 1" as the next value to try.
+		 *
+		 * biggest_id is always > smallest_bigger_id, except in the
+		 * "perfectly ordered" case.
+		 */
+
 		curr_id = scan_devname(alias, prefix);
 		if (curr_id == id) {
 			if (id < INT_MAX)