Message ID | 3464545fe3feceb08408618c77a70cc95745bfe9.1707857541.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | FSMonitor edge cases on case-insensitive file systems | expand |
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Jeff Hostetler <jeffhostetler@github.com> > > Create a new version of index_dir_exists() to return the canonical > spelling of the matched directory prefix. > > The existing index_dir_exists() returns a boolean to indicate if > there is a case-insensitive match in the directory name-hash, but > it doesn't tell the caller the exact spelling of that match. > > The new version also copies the matched spelling to a provided strbuf. > This lets the caller, for example, then call index_name_pos() with the > correct case to search the cache-entry array for the real insertion > position. > > Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> > --- > name-hash.c | 16 ++++++++++++++++ > name-hash.h | 2 ++ > 2 files changed, 18 insertions(+) > > diff --git a/name-hash.c b/name-hash.c > index 251f036eef6..d735c81acb3 100644 > --- a/name-hash.c > +++ b/name-hash.c > @@ -694,6 +694,22 @@ int index_dir_exists(struct index_state *istate, const char *name, int namelen) > dir = find_dir_entry(istate, name, namelen); > return dir && dir->nr; > } > +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, > + struct strbuf *canonical_path) > +{ > + struct dir_entry *dir; > + > + strbuf_init(canonical_path, namelen+1); > + > + lazy_init_name_hash(istate); > + expand_to_path(istate, name, namelen, 0); > + dir = find_dir_entry(istate, name, namelen); > + > + if (dir && dir->nr) > + strbuf_add(canonical_path, dir->name, dir->namelen); > + > + return dir && dir->nr; > +} > > void adjust_dirname_case(struct index_state *istate, char *name) Missing inter-function blank line, before the new function. I wonder if we can avoid such repetition---the body of index_dir_exists() is 100% shared with this new function. Isn't it extremely unusual to receive "struct strbuf *" and call strbuf_init() on it? It means that the caller is expected to have a strbuf and pass a pointer to it, but also it is expected to leave the strbuf uninitialized. I'd understand if it calls strbuf_reset(), but it may not even be necessary, if we make it responsibility of the caller to pass a valid strbuf to be appended into. int index_dir_find(struct index_state *istate, const char *name, int namelen, struct strbuf *canonical_path) { struct dir_entry *dir; lazy_init_name_hash(istate); expand_to_path(istate, name, namelen, 0); dir = find_dir_entry(istate, name, namelen); if (canonical_path && dir && dir->nr) { // strbuf_reset(canonical_path); ??? strbuf_add(canonical_path, dir->name, dir->namelen); } return dir && dir->nr; } Then we can do #define index_dir_exists(i, n, l) index_dir_find((i), (n), (l), NULL) in the header for existing callers. > { > diff --git a/name-hash.h b/name-hash.h > index b1b4b0fb337..2fcac5c4870 100644 > --- a/name-hash.h > +++ b/name-hash.h > @@ -5,6 +5,8 @@ struct cache_entry; > struct index_state; > > int index_dir_exists(struct index_state *istate, const char *name, int namelen); > +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, > + struct strbuf *canonical_path); > void adjust_dirname_case(struct index_state *istate, char *name); > struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase);
On Tue, Feb 13, 2024 at 08:52:11PM +0000, Jeff Hostetler via GitGitGadget wrote: > From: Jeff Hostetler <jeffhostetler@github.com> > > Create a new version of index_dir_exists() to return the canonical > spelling of the matched directory prefix. > > The existing index_dir_exists() returns a boolean to indicate if > there is a case-insensitive match in the directory name-hash, but > it doesn't tell the caller the exact spelling of that match. > > The new version also copies the matched spelling to a provided strbuf. > This lets the caller, for example, then call index_name_pos() with the > correct case to search the cache-entry array for the real insertion > position. > > Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> > --- > name-hash.c | 16 ++++++++++++++++ > name-hash.h | 2 ++ > 2 files changed, 18 insertions(+) > > diff --git a/name-hash.c b/name-hash.c > index 251f036eef6..d735c81acb3 100644 > --- a/name-hash.c > +++ b/name-hash.c > @@ -694,6 +694,22 @@ int index_dir_exists(struct index_state *istate, const char *name, int namelen) > dir = find_dir_entry(istate, name, namelen); > return dir && dir->nr; > } > +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, > + struct strbuf *canonical_path) > +{ > + struct dir_entry *dir; > + > + strbuf_init(canonical_path, namelen+1); Missing spaces: `namelen + 1`. > + lazy_init_name_hash(istate); > + expand_to_path(istate, name, namelen, 0); > + dir = find_dir_entry(istate, name, namelen); > + > + if (dir && dir->nr) > + strbuf_add(canonical_path, dir->name, dir->namelen); > + > + return dir && dir->nr; > +} Can we maybe give this function a more descriptive name? `index_dir_exists2()` doesn't give the reader any indicator what is different about it compared to `index_dir_exists()`. How about `index_dir_exists_with_canonical()`? > void adjust_dirname_case(struct index_state *istate, char *name) > { > diff --git a/name-hash.h b/name-hash.h > index b1b4b0fb337..2fcac5c4870 100644 > --- a/name-hash.h > +++ b/name-hash.h > @@ -5,6 +5,8 @@ struct cache_entry; > struct index_state; > > int index_dir_exists(struct index_state *istate, const char *name, int namelen); > +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, > + struct strbuf *canonical_path); It would also be great to add comments here that explain what those functions do and what the difference between them is. Patrick > void adjust_dirname_case(struct index_state *istate, char *name); > struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase); > > -- > gitgitgadget > >
On 2/13/24 4:43 PM, Junio C Hamano wrote: > "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Jeff Hostetler <jeffhostetler@github.com> >> >> Create a new version of index_dir_exists() to return the canonical >> spelling of the matched directory prefix. >> >> The existing index_dir_exists() returns a boolean to indicate if >> there is a case-insensitive match in the directory name-hash, but >> it doesn't tell the caller the exact spelling of that match. >> >> The new version also copies the matched spelling to a provided strbuf. >> This lets the caller, for example, then call index_name_pos() with the >> correct case to search the cache-entry array for the real insertion >> position. >> >> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> >> --- >> name-hash.c | 16 ++++++++++++++++ >> name-hash.h | 2 ++ >> 2 files changed, 18 insertions(+) >> >> diff --git a/name-hash.c b/name-hash.c >> index 251f036eef6..d735c81acb3 100644 >> --- a/name-hash.c >> +++ b/name-hash.c >> @@ -694,6 +694,22 @@ int index_dir_exists(struct index_state *istate, const char *name, int namelen) >> dir = find_dir_entry(istate, name, namelen); >> return dir && dir->nr; >> } >> +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, >> + struct strbuf *canonical_path) >> +{ >> + struct dir_entry *dir; >> + >> + strbuf_init(canonical_path, namelen+1); >> + >> + lazy_init_name_hash(istate); >> + expand_to_path(istate, name, namelen, 0); >> + dir = find_dir_entry(istate, name, namelen); >> + >> + if (dir && dir->nr) >> + strbuf_add(canonical_path, dir->name, dir->namelen); >> + >> + return dir && dir->nr; >> +} >> >> void adjust_dirname_case(struct index_state *istate, char *name) > > Missing inter-function blank line, before the new function. > > I wonder if we can avoid such repetition---the body of > index_dir_exists() is 100% shared with this new function. > > Isn't it extremely unusual to receive "struct strbuf *" and call > strbuf_init() on it? It means that the caller is expected to have a > strbuf and pass a pointer to it, but also it is expected to leave > the strbuf uninitialized. > > I'd understand if it calls strbuf_reset(), but it may not even be > necessary, if we make it responsibility of the caller to pass a > valid strbuf to be appended into. > > int index_dir_find(struct index_state *istate, > const char *name, int namelen, > struct strbuf *canonical_path) > { > struct dir_entry *dir; > > lazy_init_name_hash(istate); > expand_to_path(istate, name, namelen, 0); > dir = find_dir_entry(istate, name, namelen); > > if (canonical_path && dir && dir->nr) { > // strbuf_reset(canonical_path); ??? > strbuf_add(canonical_path, dir->name, dir->namelen); > } > return dir && dir->nr; > } > > Then we can do > > #define index_dir_exists(i, n, l) index_dir_find((i), (n), (l), NULL) > > in the header for existing callers. > I'm always a little hesitant to change the signature of an existing function and chasing all of the callers in the middle of another task. It can sometimes be distracting to reviewers. I like your macro approach here. I'll do that in the next version. Thanks Jeff
Jeff Hostetler <git@jeffhostetler.com> writes: > I'm always a little hesitant to change the signature of an existing > function and chasing all of the callers in the middle of another > task. It can sometimes be distracting to reviewers. Of course we all should be hesitant. In addition to reviewers, there are topics in flight and topics people are cooking but not posted that will be affected. So it is perfectly fine to introduce an enhanced version as needed under different name (but let's not give it a meaningless name like "foo2" where it is totally unclear and unexplained what its difference from "foo" is from the name), but if it is meant as an enhanced version, we should aim to share the code and rewrite the original in terms of the enhanced one, instead of simply duplicating to risk unnecessary divergence of the two functions in the future. Thanks.
diff --git a/name-hash.c b/name-hash.c index 251f036eef6..d735c81acb3 100644 --- a/name-hash.c +++ b/name-hash.c @@ -694,6 +694,22 @@ int index_dir_exists(struct index_state *istate, const char *name, int namelen) dir = find_dir_entry(istate, name, namelen); return dir && dir->nr; } +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, + struct strbuf *canonical_path) +{ + struct dir_entry *dir; + + strbuf_init(canonical_path, namelen+1); + + lazy_init_name_hash(istate); + expand_to_path(istate, name, namelen, 0); + dir = find_dir_entry(istate, name, namelen); + + if (dir && dir->nr) + strbuf_add(canonical_path, dir->name, dir->namelen); + + return dir && dir->nr; +} void adjust_dirname_case(struct index_state *istate, char *name) { diff --git a/name-hash.h b/name-hash.h index b1b4b0fb337..2fcac5c4870 100644 --- a/name-hash.h +++ b/name-hash.h @@ -5,6 +5,8 @@ struct cache_entry; struct index_state; int index_dir_exists(struct index_state *istate, const char *name, int namelen); +int index_dir_exists2(struct index_state *istate, const char *name, int namelen, + struct strbuf *canonical_path); void adjust_dirname_case(struct index_state *istate, char *name); struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase);