diff mbox series

submodule: separate out not-found and not-empty errors

Message ID YZQ5Zk0ItWvfr8sF@fedora19.localdomain (mailing list archive)
State New, archived
Headers show
Series submodule: separate out not-found and not-empty errors | expand

Commit Message

Ian Wienand Nov. 16, 2021, 11:06 p.m. UTC
After upgrading past 505a2765963 a long-working script to cache git
repos started failing with

 Could not access submodule '...'

for every updated submodule on each fetch [1].

Ultimately this turned out to be using "--git-dir=" from outside the
repo; i.e. we really wanted "-C" in this script (the man page does
warn about this -- but it was working for a long time).

Although obvious in hindsight, this was very difficult to diagnose
from the error message.  It required me adding debugging to these
functions to determine why it was falling into this path when
everything looked right on disk.

This proposes separate messages for the directory missing v. being
present but having unexpected contents.  Both messages are modified to
give the path that is being examined.

[1] https://review.opendev.org/c/openstack/diskimage-builder/+/818053

Signed-off-by: Ian Wienand <iwienand@redhat.com>
---
 submodule.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Nov. 17, 2021, 9:39 a.m. UTC | #1
Ian Wienand <iwienand@redhat.com> writes:

>  			    !is_empty_dir(empty_submodule_path.buf)) {
>  				spf->result = 1;
> -				strbuf_addf(err,
> -					    _("Could not access submodule '%s'\n"),
> -					    ce->name);
> +				/* is_empty_dir also catches missing dirtectories, but report separately */
> +				if (!is_directory(empty_submodule_path.buf)) {

I was hoping that inspecting errno after is_empty_dir() returned
might be sufficient (of course, we need to clear errno before
calling is_empty_dir() if we go that route), but because this is an
error codepath that we do not need to optimize, a call to
is_directory() that incurs another system call would be fine.

> +				  strbuf_addf(err,
> +					      _("Submodule directory '%s' not found (incorrect --git-dir?)\n"),

"not found" is something the code definitely knows (eh, not quite,
but let's read on).  

But let's not make an uninformed guess.  This code didn't even check
if the user gave a --git-dir option.

If the user is advanced enough to have given "--git-dir", "not found"
should be sufficient to hint that the way the user specified the
repository location incorrectly, and a wrong "--git-dir" might be
one of the many things the user might suspect on their own.

Another problem with the message is !is_directory() can mean "there
is no filesystem entity at the path" (i.e. "submodule directory '%s'
does not exist") and it can also mean "there is a filesystem entity
at the path, but that is not a directory).  "not found" is not exactly
a good message to give in the latter case.

We are giving two messages here in this codepath.  For example, the
original one would have said something like:

	Could not access submodule 'foo'
	Submodule directory 'foo' is not empty

So I suspect that a more appropriate phrasing for the other one (the
new one you added) would be something like

	Could not access submodule 'foo'
	Path to the submodule 'foo' is not a directory

perhaps?

Thanks.


> +					      empty_submodule_path.buf);
> +				} else {
> +				  strbuf_addf(err,
> +					      _("Submodule directory '%s' is not empty\n"),
> +					      empty_submodule_path.buf);
> +				}
>  			}
>  			strbuf_release(&empty_submodule_path);
>  		}
Ian Wienand Nov. 18, 2021, 4:06 a.m. UTC | #2
Thanks for the review!

On Wed, Nov 17, 2021 at 01:39:08AM -0800, Junio C Hamano wrote:
> Ian Wienand <iwienand@redhat.com> writes:
> 
> >  			    !is_empty_dir(empty_submodule_path.buf)) {
> >  				spf->result = 1;
> > -				strbuf_addf(err,
> > -					    _("Could not access submodule '%s'\n"),
> > -					    ce->name);
> > +				/* is_empty_dir also catches missing dirtectories, but report separately */
> > +				if (!is_directory(empty_submodule_path.buf)) {
> 
> I was hoping that inspecting errno after is_empty_dir() returned
> might be sufficient (of course, we need to clear errno before
> calling is_empty_dir() if we go that route), but because this is an
> error codepath that we do not need to optimize, a call to
> is_directory() that incurs another system call would be fine.

That was my thinking too, I do admit it's a bit of an obscure case to
end up in, but the regular path doesn't need any more checking.

> > +				  strbuf_addf(err,
> > +					      _("Submodule directory '%s' not found (incorrect --git-dir?)\n"),
> 
> "not found" is something the code definitely knows (eh, not quite,
> but let's read on).  
> 
> But let's not make an uninformed guess.  This code didn't even check
> if the user gave a --git-dir option.
> 
> If the user is advanced enough to have given "--git-dir", "not found"
> should be sufficient to hint that the way the user specified the
> repository location incorrectly, and a wrong "--git-dir" might be
> one of the many things the user might suspect on their own.

Ok, I can drop that bit, as you say it's a guess.

> Another problem with the message is !is_directory() can mean "there
> is no filesystem entity at the path" (i.e. "submodule directory '%s'
> does not exist") and it can also mean "there is a filesystem entity
> at the path, but that is not a directory).  "not found" is not exactly
> a good message to give in the latter case.
> 
> We are giving two messages here in this codepath.  For example, the
> original one would have said something like:
> 
> 	Could not access submodule 'foo'
> 	Submodule directory 'foo' is not empty

Actually in the original patch I dropped the "Could not access..."
bit, but I agree it makes more sense with the more specific details
added.  I've put that back and append the secondary message below.

> So I suspect that a more appropriate phrasing for the other one (the
> new one you added) would be something like
> 
> 	Could not access submodule 'foo'
> 	Path to the submodule 'foo' is not a directory
> 
> perhaps?

Done

-i

From 6ea39c3e43eccf93e4567329cfb1b9c642154283 Mon Sep 17 00:00:00 2001
From: Ian Wienand <iwienand@redhat.com>
Date: Wed, 17 Nov 2021 09:39:40 +1100
Subject: [PATCH] submodule: separate out not-found and not-empty errors

After upgrading past 505a2765963 a long-working script to cache git
repos started failing with

 Could not access submodule '...'

for every updated submodule on each fetch [1].

Ultimately this turned out to be using "--git-dir=" from outside the
repo; i.e. we really wanted "-C" in this script (the man page does
warn about this -- but it was working for a long time).

Although obvious in hindsight, this was very difficult to diagnose
from the error message.  It required me adding debugging to these
functions to determine why it was falling into this path when
everything looked right on disk.

This proposes separate messages for the directory missing v. being
present but having unexpected contents.  Both messages are modified to
give the path that is being examined.

[1] https://review.opendev.org/c/openstack/diskimage-builder/+/818053

Signed-off-by: Ian Wienand <iwienand@redhat.com>
---
 submodule.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/submodule.c b/submodule.c
index c689070524..352ee50f2e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1524,6 +1524,16 @@ static int get_next_submodule(struct child_process *cp,
 				strbuf_addf(err,
 					    _("Could not access submodule '%s'\n"),
 					    ce->name);
+				/* is_empty_dir also catches missing directories, but report separately */
+				if (!is_directory(empty_submodule_path.buf)) {
+				  strbuf_addf(err,
+					      _("Submodule path '%s' is not a directory\n"),
+					      empty_submodule_path.buf);
+				} else {
+				  strbuf_addf(err,
+					      _("Submodule directory '%s' is not empty\n"),
+					      empty_submodule_path.buf);
+				}
 			}
 			strbuf_release(&empty_submodule_path);
 		}
diff mbox series

Patch

diff --git a/submodule.c b/submodule.c
index c689070524..910ee6ba7d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1521,9 +1521,16 @@  static int get_next_submodule(struct child_process *cp,
 			if (S_ISGITLINK(ce->ce_mode) &&
 			    !is_empty_dir(empty_submodule_path.buf)) {
 				spf->result = 1;
-				strbuf_addf(err,
-					    _("Could not access submodule '%s'\n"),
-					    ce->name);
+				/* is_empty_dir also catches missing dirtectories, but report separately */
+				if (!is_directory(empty_submodule_path.buf)) {
+				  strbuf_addf(err,
+					      _("Submodule directory '%s' not found (incorrect --git-dir?)\n"),
+					      empty_submodule_path.buf);
+				} else {
+				  strbuf_addf(err,
+					      _("Submodule directory '%s' is not empty\n"),
+					      empty_submodule_path.buf);
+				}
 			}
 			strbuf_release(&empty_submodule_path);
 		}