Message ID | 20230901180235.23980-6-mwilck@suse.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | christophe varoqui |
Headers | show |
Series | multipath-tools: user-friendly names rework | expand |
On Fri, Sep 01, 2023 at 08:02:18PM +0200, mwilck@suse.com wrote: > From: Martin Wilck <mwilck@suse.com> > > When I read this code, I always get confused. Adding comments to > explain the algorithm. > > Signed-off-by: Martin Wilck <mwilck@suse.com> > --- > libmultipath/alias.c | 35 +++++++++++++++++++++++++++++++++++ > 1 file changed, 35 insertions(+) > > diff --git a/libmultipath/alias.c b/libmultipath/alias.c > index f7834d1..e61eb91 100644 > --- a/libmultipath/alias.c > +++ b/libmultipath/alias.c > @@ -172,6 +172,41 @@ lookup_binding(FILE *f, const char *map_wwid, char **map_alias, > alias = strtok_r(buf, " \t", &saveptr); > if (!alias) /* blank line */ > continue; > + > + /* > + * Find an unused index - explanation of the algorithm > + * > + * ID: 1 = mpatha, 2 = mpathb, ... > + * > + * We assume the bindings are unsorted. The only constraint > + * is that no ID occurs more than once. IDs that occur in the > + * bindings are called "used". > + * > + * We call the list 1,2,3,..., exactly in this order, the list > + * of "expected" IDs. The variable "id" always holds the next > + * "expected" ID, IOW the last "expected" ID encountered plus 1. > + * Thus all IDs below "id" are known to be used. However, at the > + * end of the loop, the value of "id" isn't necessarily unused. > + * > + * "smallest_bigger_id" is the smallest used ID that was > + * encountered while it was larger than the next "expected" ID > + * at that iteration. Let X be some used ID. If all IDs below X > + * are used and encountered in the right sequence before X, "id" > + * will be > X when the loop ends. Otherwise, X was encountered > + * "out of order", the condition (X > id) holds when X is > + * encountered, and "smallest_bigger_id" will be set to X; i.e. > + * it will be less or equal than X when the loop ends. > + * > + * At the end of the loop, (id < smallest_bigger_id) means that > + * the value of "id" had been encountered neither in order nor > + * out of order, and is thus unused. (id >= smallest_bigger_id) I know the check is (id >= smallest_bigger_id), but as long as no ID occurs more than once, id can never actually be bigger than smallest_bigger_id since id only gets incremented when (curr_id == id) and if smallest_bigger_id is not INT_MAX, then smallest_bigger_id already occured once in the file before id was incremented to equal it. This means it can't occur again, so id can never get incremented past it. Not this this really matters, so Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > + * means that "id"'s value is in use. In this case, we play safe > + * and use "biggest_id + 1" as the next value to try. > + * > + * biggest_id is always > smallest_bigger_id, except in the > + * "perfectly ordered" case. > + */ > + > curr_id = scan_devname(alias, prefix); > if (curr_id == id) { > if (id < INT_MAX) > -- > 2.41.0 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
On Wed, 2023-09-06 at 17:43 -0500, Benjamin Marzinski wrote: > On Fri, Sep 01, 2023 at 08:02:18PM +0200, mwilck@suse.com wrote: > > From: Martin Wilck <mwilck@suse.com> > > > > When I read this code, I always get confused. Adding comments to > > explain the algorithm. > > > > Signed-off-by: Martin Wilck <mwilck@suse.com> > > --- > > libmultipath/alias.c | 35 +++++++++++++++++++++++++++++++++++ > > 1 file changed, 35 insertions(+) > > > > diff --git a/libmultipath/alias.c b/libmultipath/alias.c > > index f7834d1..e61eb91 100644 > > --- a/libmultipath/alias.c > > +++ b/libmultipath/alias.c > > @@ -172,6 +172,41 @@ lookup_binding(FILE *f, const char *map_wwid, > > char **map_alias, > > alias = strtok_r(buf, " \t", &saveptr); > > if (!alias) /* blank line */ > > continue; > > + > > + /* > > + * Find an unused index - explanation of the > > algorithm > > + * > > + * ID: 1 = mpatha, 2 = mpathb, ... > > + * > > + * We assume the bindings are unsorted. The only > > constraint > > + * is that no ID occurs more than once. IDs that > > occur in the > > + * bindings are called "used". > > + * > > + * We call the list 1,2,3,..., exactly in this > > order, the list > > + * of "expected" IDs. The variable "id" always > > holds the next > > + * "expected" ID, IOW the last "expected" ID > > encountered plus 1. > > + * Thus all IDs below "id" are known to be used. > > However, at the > > + * end of the loop, the value of "id" isn't > > necessarily unused. > > + * > > + * "smallest_bigger_id" is the smallest used ID > > that was > > + * encountered while it was larger than the next > > "expected" ID > > + * at that iteration. Let X be some used ID. If all > > IDs below X > > + * are used and encountered in the right sequence > > before X, "id" > > + * will be > X when the loop ends. Otherwise, X was > > encountered > > + * "out of order", the condition (X > id) holds > > when X is > > + * encountered, and "smallest_bigger_id" will be > > set to X; i.e. > > + * it will be less or equal than X when the loop > > ends. > > + * > > + * At the end of the loop, (id < > > smallest_bigger_id) means that > > + * the value of "id" had been encountered neither > > in order nor > > + * out of order, and is thus unused. (id >= > > smallest_bigger_id) > > I know the check is (id >= smallest_bigger_id), but as long as no ID > occurs more than once, id can never actually be bigger than > smallest_bigger_id since id only gets incremented when (curr_id == > id) > and if smallest_bigger_id is not INT_MAX, then smallest_bigger_id > already occured once in the file before id was incremented to equal > it. > This means it can't occur again, so id can never get incremented past > it. Not this this really matters, so Right. Like you, I thought this didn't matter for the algorithm as such, and that explaining it would make the lengthy explanation even lengthier. Btw, if you have a simpler explanation of this algorithm, please provide it ;-) Martin > > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > > > + * means that "id"'s value is in use. In this case, > > we play safe > > + * and use "biggest_id + 1" as the next value to > > try. > > + * > > + * biggest_id is always > smallest_bigger_id, > > except in the > > + * "perfectly ordered" case. > > + */ > > + > > curr_id = scan_devname(alias, prefix); > > if (curr_id == id) { > > if (id < INT_MAX) > > -- > > 2.41.0 > -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
diff --git a/libmultipath/alias.c b/libmultipath/alias.c index f7834d1..e61eb91 100644 --- a/libmultipath/alias.c +++ b/libmultipath/alias.c @@ -172,6 +172,41 @@ lookup_binding(FILE *f, const char *map_wwid, char **map_alias, alias = strtok_r(buf, " \t", &saveptr); if (!alias) /* blank line */ continue; + + /* + * Find an unused index - explanation of the algorithm + * + * ID: 1 = mpatha, 2 = mpathb, ... + * + * We assume the bindings are unsorted. The only constraint + * is that no ID occurs more than once. IDs that occur in the + * bindings are called "used". + * + * We call the list 1,2,3,..., exactly in this order, the list + * of "expected" IDs. The variable "id" always holds the next + * "expected" ID, IOW the last "expected" ID encountered plus 1. + * Thus all IDs below "id" are known to be used. However, at the + * end of the loop, the value of "id" isn't necessarily unused. + * + * "smallest_bigger_id" is the smallest used ID that was + * encountered while it was larger than the next "expected" ID + * at that iteration. Let X be some used ID. If all IDs below X + * are used and encountered in the right sequence before X, "id" + * will be > X when the loop ends. Otherwise, X was encountered + * "out of order", the condition (X > id) holds when X is + * encountered, and "smallest_bigger_id" will be set to X; i.e. + * it will be less or equal than X when the loop ends. + * + * At the end of the loop, (id < smallest_bigger_id) means that + * the value of "id" had been encountered neither in order nor + * out of order, and is thus unused. (id >= smallest_bigger_id) + * means that "id"'s value is in use. In this case, we play safe + * and use "biggest_id + 1" as the next value to try. + * + * biggest_id is always > smallest_bigger_id, except in the + * "perfectly ordered" case. + */ + curr_id = scan_devname(alias, prefix); if (curr_id == id) { if (id < INT_MAX)