Message ID | pull.1227.git.1651324796892.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | builtin/remote.c: teach `-v` to list filters for promisor remotes | expand |
"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com> writes: > if (remote->url_nr > 0) { > + struct strbuf promisor_config = STRBUF_INIT; > + const char *partial_clone_filter = NULL; > + > + strbuf_addf(&promisor_config, "remote.%s.partialclonefilter", remote->name); > strbuf_addf(&url_buf, "%s (fetch)", remote->url[0]); > + if (!git_config_get_string_tmp(promisor_config.buf, &partial_clone_filter)) > + strbuf_addf(&url_buf, " [%s]", partial_clone_filter); > + > + strbuf_release(&promisor_config); > string_list_append(list, remote->name)->util = > strbuf_detach(&url_buf, NULL); Three comments and a half on the code: - Is it likely that to new readers it would be obvious that what is in the [square brackets] is the list-objects-filter used? When we want to add new kinds of information other than the URL and the list-objects-filter, what is our plan to add them? - The presentation order is <remote-name> then <direction> (fetch or push) and then optionally <list-objects-filter>. (a) shouldn't the output format be described in the doucmentation? (b) does it make sense to append new information like this, or is it more logical to keep the <direction> at the end? - Now url_buf no longer contains the url of the remote, but it still is called url_buf. It is merely a "temporary string" now. Is it a good idea to either rename it, stop reusing the same thing for different purposes, or do something else? - By adding this unconditionally, we would break the scripts that read the output from this command and expect there won't be extra information after the <direction>. It may be a good thing (they are not prepared to see the list-objects-filter, and the breakage may serve as a reminder that they need to update these scripts when they see breakage), or it may be an irritating regression. But stepping back a bit. Why do we want to give this in the "remote -v" output in the first place? When a reader really cares, they can ask "git config" for this extra piece of information. When you have more than one remote, "git remote -v" that gives the URL is a good way to remind which nickname you'd want to give to "git pull" or "git push". If it makes sense to add the extra <list-objects-filtrer> information, that would mean that there are probably two remote nicknames that refer to the same URL (i.e. "remote -v" readers cannot tell them apart without extra information), but how likely is that, I wonder? > diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh > index 4a3778d04a8..bf8f3644d3c 100755 > --- a/t/t5616-partial-clone.sh > +++ b/t/t5616-partial-clone.sh > @@ -49,6 +49,17 @@ test_expect_success 'do partial clone 1' ' > test "$(git -C pc1 config --local remote.origin.partialclonefilter)" = "blob:none" > ' > > +test_expect_success 'filters for promisor remotes is listed by git remote -v' ' > + git clone --filter=blob:none "file://$(pwd)/srv.bare" pc2 && > + git -C pc2 remote -v >out && > + grep "[blob:none]" out && > + > + git -C pc2 config remote.origin.partialCloneFilter object:type=commit && > + git -C pc2 remote -v >out && > + grep "[object:type=commit]" out && > + rm -rf pc2 > +' > + > test_expect_success 'verify that .promisor file contains refs fetched' ' > ls pc1/.git/objects/pack/pack-*.promisor >promisorlist && > test_line_count = 1 promisorlist && > > base-commit: 0f828332d5ac36fc63b7d8202652efa152809856
Sorry for the late response. Junio C Hamano <gitster@pobox.com> wrote: > Three comments and a half on the code: > > - Is it likely that to new readers it would be obvious that what is > in the [square brackets] is the list-objects-filter used? When we > want to add new kinds of information other than the URL and the > list-objects-filter, what is our plan to add them? I do think that new readers can easily understand the meaning of the text inside the [square brackets]. These square brackets (with the list-objects-filter inside it) will be shown only if the remote is a promisor remote. So, users who don't use promisor remotes, will not be affected. Those who used the filters can only notice the change. They can easily understand it. In fact, I think it would give them an option to quickly check which are the promisor remotes and which are not. Though this change should be properly documented (which I forgot to add) so that they can be sure about it. > - The presentation order is <remote-name> then <direction> (fetch > or push) and then optionally <list-objects-filter>. > > (a) shouldn't the output format be described in the > doucmentation? > > (b) does it make sense to append new information like this, or > is it more logical to keep the <direction> at the end? Yeah, it should be documented. I forgot it :| Will add it in the next version. I think it is better to keep <list-objects-filter> at the end. Because I think, people first want to check whether the remote is (fetch) or (push). After that, they might want to know about the filter. Another point is that <list-objects-filter> is optional (i.e. only for promisor remotes). It would not make sense to put an optional info in between two permanent info (in this case, <remote-name> and <direction>). It would be difficult for scripts which parse the output of `git remote -v` on the basis of string positions. > - Now url_buf no longer contains the url of the remote, but it still > is called url_buf. It is merely a "temporary string" now. Is it > a good idea to either rename it, stop reusing the same thing for > different purposes, or do something else? Hmm, this can be a subject for discussion. Yes, it is true that the name `url_buf` is not suitable for the additional info it contains ( in the proposed change). I did it to use less memory. I think renaming it to `remote_info_buf` or similar is a better idea. > - By adding this unconditionally, we would break the scripts that > read the output from this command and expect there won't be extra > information after the <direction>. It may be a good thing (they > are not prepared to see the list-objects-filter, and the breakage > may serve as a reminder that they need to update these scripts > when they see breakage), or it may be an irritating regression. I agree. Frankly speaking, I have no counter argument for this. I can tell that the proposed change will be beneficial for the users who use promisor remotes along with other remotes. So, may be we can accept the short term consequences of it. What we can do is we can provide a proper documentation so that if anything bad happen to those scripts, devs can see the documentation and update the scripts accordingly. > But stepping back a bit. > > Why do we want to give this in the "remote -v" output in the first > place? When a reader really cares, they can ask "git config" for > this extra piece of information. When you have more than one > remote, "git remote -v" that gives the URL is a good way to remind > which nickname you'd want to give to "git pull" or "git push". `remote -v` helps users to get the overall idea of the remotes. We can see how many remotes are there, which remote name corresponds to which url etc. That is we can get a summary of remotes. Having that said, does not it make sense to add the extra <list-objects-filter> here? Users can easily understand which are promisor remotes ( along with their filter type) and which are not. Of course, they can use git config for that. But it would be a tidious job to check the the type of remotes (i.e. which are promisor remotes and which are not) one by one. If the user try to search for the promisor remotes in the config file, he/she have to go through the other configuration settings (irrelevant to him/her at that time) to reach the `[remote]` section. Isn't it? > ... If > it makes sense to add the extra <list-objects-filtrer> information, > that would mean that there are probably two remote nicknames that > refer to the same URL (i.e. "remote -v" readers cannot tell them > apart without extra information), but how likely is that, I wonder? I think, having a proper documentation about the new changes is the answer to it. Thanks :)
Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes: > Sorry for the late response. > > Junio C Hamano <gitster@pobox.com> wrote: > >> Three comments and a half on the code: >> >> - Is it likely that to new readers it would be obvious that what is >> in the [square brackets] is the list-objects-filter used? When we >> want to add new kinds of information other than the URL and the >> list-objects-filter, what is our plan to add them? > > I do think that new readers can easily understand the meaning of the > text inside the [square brackets]. These square brackets (with the > list-objects-filter inside it) will be shown only if the remote is > a promisor remote. So, users who don't use promisor remotes, will not > be affected. Those who used the filters can only notice the change. > They can easily understand it. In fact, I think it would give them an > option to quickly check which are the promisor remotes and which are not. > Though this change should be properly documented (which I forgot to > add) so that they can be sure about it. You forgot to answer more important half of the question. It would be easy for you to know what the string inside brackets means because you are so obsessed with the promisor remote to write this patch ;-) But when we need to add even more pieces of information in the future, will it stay so? Can "[some-random-string]" easily be identified as a list-objects-filter by those who do not care particularly about promisor remotes (e.g. those who wanted to see the URL to tell multiple remote nicknames apart) when the line has even more piece of information in the future? At some point, we'd need to either (1) stop adding too many details to avoid cluttering the output line, or (2) start labeling each piece of information to make it easy for the readers to identify which one is which [*]. We need to ask ourselves why now is not that "some point" already. Side note: and the strategy to add new pieces of information need to take the same approach between the two, and that is why we need "what is the plan to add new pieces of information?" answered. > (i.e. which are promisor remotes and which are not) one by one. If the > user try to search for the promisor remotes in the config file, he/she > have to go through the other configuration settings (irrelevant to him/her > at that time) to reach the `[remote]` section. Isn't it? Sorry, but the question does not make much sense to me. Why is a piece of information you get from "git config" irrelevant if you get it in order to figure out what you want to know, i.e. what promisor remote do we rely on? >> ... If >> it makes sense to add the extra <list-objects-filtrer> information, >> that would mean that there are probably two remote nicknames that >> refer to the same URL (i.e. "remote -v" readers cannot tell them >> apart without extra information), but how likely is that, I wonder? > > I think, having a proper documentation about the new changes is the > answer to it. The question is "what can readers achieve by having this extra information in 'remote -v' output". Do you have to duck the question because you cannot answer in a simple sentence, and instead readers must read reams of documentation pages? I doubt it would be that obscure. I wanted to like the patch, the changed text is simple enough, but quite honestly, the lack of clarity in the answers to the most basic "why do we want this? what is this good for? how does this help the users?" questions, I am not yet succeeding to do so. Thanks.
Junio C Hamano <gitster@pobox.com> wrote: > You forgot to answer more important half of the question. It would > be easy for you to know what the string inside brackets means > because you are so obsessed with the promisor remote to write this > patch ;-) But when we need to add even more pieces of information in > the future, will it stay so? Can "[some-random-string]" easily be > identified as a list-objects-filter by those who do not care > particularly about promisor remotes (e.g. those who wanted to see > the URL to tell multiple remote nicknames apart) when the line has > even more piece of information in the future? > > At some point, we'd need to either (1) stop adding too many details > to avoid cluttering the output line, or (2) start labeling each > piece of information to make it easy for the readers to identify > which one is which [*]. We need to ask ourselves why now is not > that "some point" already. > > Side note: and the strategy to add new pieces of information > need to take the same approach between the two, and that is why > we need "what is the plan to add new pieces of information?" > answered. I am sorry if I failed to explain you what I really wanted to mean. Yes, I forgot to answer the last question which is "When we want to add new kinds of information other than the URL and the list-objects-filter, what is our plan to add them?". So let me answer this now. As `-v` flag gives a kind of overall summary of the remotes, users expect that the most important and most basic information should be listed in the output of `remote -v`. So, there may be some other more important informations in the future that we have to add to `remote -v` output. In that case, method (1) would not be a great idea I think (unless a new flag has been created). method (2) is better. > > (i.e. which are promisor remotes and which are not) one by one. If the > > user try to search for the promisor remotes in the config file, he/she > > have to go through the other configuration settings (irrelevant to him/her > > at that time) to reach the `[remote]` section. Isn't it? > > Sorry, but the question does not make much sense to me. Why is a > piece of information you get from "git config" irrelevant if you get > it in order to figure out what you want to know, i.e. what promisor > remote do we rely on? Let me explain what I really meant here - I am guessing that you have no problem with the upper part of that para. If we forget about my proposed change, there are two possible ways to find out the info about promisor remotes - (1) Use `git config --get remote.<remote-name>.partialCloneFilter` This command gives an output only if <remote-name> is a promisor remote. So in case the user forget which one is a promisor remote, he/she has to try this command with each and every <remote-name> to find out which is/are the promisor remote(s). (2) Open the git config file (either manually or by running `git config --edit` In this case, the user has to go through all the settings until the [remote "<remote-name>"] section is found. E.g. let's say below is the config file - [core] repositoryformatversion = 0 filemode = true bare = false logallrefupdates = true ignorecase = true precomposeunicode = true [remote "upstream"] url = https://github.com/git/git.git fetch = +refs/heads/*:refs/remotes/upstream/* [branch "master"] remote = upstream merge = refs/heads/master [remote "origin"] url = https://github.com/Abhra303/git.git fetch = +refs/heads/*:refs/remotes/origin/* partialCloneFilter = blob:none To find out whether "origin" is promisor or not, he has to go to the [remote "origin"] section. Here all the configuations under `[core]`, `[remote "upstream"]` and `[branch "master"] are irrelevant to him/her at that time (because he/she is not interested to know about those configuration settings at that time). The proposed change is simpler compared to the above as it lists down all the remotes along with their list-objects-filter. Another point is that it's important for an user to know which one is a promisor remote and what filter type they use. If we go with the current implementation the output would be let's say - origin <remote-url> (fetch) origin <remote-url> (push) upstream <remote-url> (fetch) upstream <remote-url> (push) By seeing the above output anyone may assume that all the remotes are normal remotes. If the user now try to run `git pull origin` and suddenly he/she discover that some blobs are not downloaded. He/she run the above mentioned (1) command and find that this is a promisor remote! Here `remote -v` didn't warn the user about the origin remote being an promisor remote. Instead it makes him/her assume that all are normal remotes. Providing only these three info (i.e. <remote-name>, <remote-url> and <direction>) is not sufficient - it only shows the half of the picture. > The question is "what can readers achieve by having this extra > information in 'remote -v' output". Do you have to duck the > question because you cannot answer in a simple sentence, and instead > readers must read reams of documentation pages? I doubt it would be > that obscure. Sorry, I misunderstood that section of your first comment. I think I hopefully answered this question in the above portion of this comment. Providing only the three information about remotes is not sufficient as it do not distinguish between promisor remotes and normal remotes. In that sense, it will add simplicity and the user would be much more clear about the remotes(i.e. which is promisor remote and which is not). > I wanted to like the patch, the changed text is simple enough, but > quite honestly, the lack of clarity in the answers to the most basic > "why do we want this? what is this good for? how does this help the > users?" questions, I am not yet succeeding to do so. My bad! Hope I am now able to answer all the questions you asked. Let me know if you still struggle to get my point. Thanks :)
On 01/05/2022 20:38, Abhradeep Chakraborty wrote: > Junio C Hamano <gitster@pobox.com> wrote: > >> You forgot to answer more important half of the question. It would >> be easy for you to know what the string inside brackets means >> because you are so obsessed with the promisor remote to write this >> patch ;-) But when we need to add even more pieces of information in >> the future, will it stay so? Can "[some-random-string]" easily be >> identified as a list-objects-filter by those who do not care >> particularly about promisor remotes (e.g. those who wanted to see >> the URL to tell multiple remote nicknames apart) when the line has >> even more piece of information in the future? >> >> At some point, we'd need to either (1) stop adding too many details >> to avoid cluttering the output line, or (2) start labeling each >> piece of information to make it easy for the readers to identify >> which one is which [*]. We need to ask ourselves why now is not >> that "some point" already. >> >> Side note: and the strategy to add new pieces of information >> need to take the same approach between the two, and that is why >> we need "what is the plan to add new pieces of information?" >> answered. > I am sorry if I failed to explain you what I really wanted to mean. > Yes, I forgot to answer the last question which is "When we > want to add new kinds of information other than the URL and the > list-objects-filter, what is our plan to add them?". > > So let me answer this now. As `-v` flag gives a kind of overall summary > of the remotes, users expect that the most important and most basic > information should be listed in the output of `remote -v`. So, there > may be some other more important informations in the future that we > have to add to `remote -v` output. In that case, method (1) would not > be a great idea I think (unless a new flag has been created). method > (2) is better. When I use the `git remote` command, I use the -vv variant to what what I need, i.e. its more than `-v`, so maybe adding an extra `--show-partial-filter` option may be necessary (with a more compact name ;-). There will also be a similar desire (IIUC) to match the sparse/cone mode repos to their remotes, i.e. to remind a user that what is held at the remote isn't the same as held locally. > >>> (i.e. which are promisor remotes and which are not) one by one. If the >>> user try to search for the promisor remotes in the config file, he/she >>> have to go through the other configuration settings (irrelevant to him/her >>> at that time) to reach the `[remote]` section. Isn't it? >> Sorry, but the question does not make much sense to me. Why is a >> piece of information you get from "git config" irrelevant if you get >> it in order to figure out what you want to know, i.e. what promisor >> remote do we rely on? > Let me explain what I really meant here - I am guessing that you have no > problem with the upper part of that para. > > If we forget about my proposed change, there are two possible ways to find > out the info about promisor remotes - > (1) Use `git config --get remote.<remote-name>.partialCloneFilter` > > This command gives an output only if <remote-name> is a promisor > remote. So in case the user forget which one is a promisor > remote, he/she has to try this command with each and every > <remote-name> to find out which is/are the promisor remote(s). I hear your pain here. I had the same issue with the branch description. (https://stackoverflow.com/questions/15058844/print-branch-description). It's the same 'extract from config' problem. ```You can display the branch description using a git config command. To show all branch descriptions, I have the alias brshow = config --get-regexp 'branch.*.description' , and for the current HEAD I have brshow1 = !git config --get "branch.$(git rev-parse --abbrev-ref HEAD).description". ``` so it is possible to generalise the config query, if hard to discover. <https://stackoverflow.com/a/15062356/717355> > > (2) Open the git config file (either manually or by running `git > config --edit` > > In this case, the user has to go through all the settings until > the [remote "<remote-name>"] section is found. E.g. let's say > below is the config file - > > [core] > repositoryformatversion = 0 > filemode = true > bare = false > logallrefupdates = true > ignorecase = true > precomposeunicode = true > [remote "upstream"] > url = https://github.com/git/git.git > fetch = +refs/heads/*:refs/remotes/upstream/* > [branch "master"] > remote = upstream > merge = refs/heads/master > [remote "origin"] > url = https://github.com/Abhra303/git.git > fetch = +refs/heads/*:refs/remotes/origin/* > partialCloneFilter = blob:none > > To find out whether "origin" is promisor or not, he has to go > to the [remote "origin"] section. Here all the configuations > under `[core]`, `[remote "upstream"]` and `[branch "master"] > are irrelevant to him/her at that time (because he/she is not > interested to know about those configuration settings at that > time). > > The proposed change is simpler compared to the above as it lists down all > the remotes along with their list-objects-filter. Another point is that > it's important for an user to know which one is a promisor remote and what > filter type they use. If we go with the current implementation the output > would be let's say - > origin <remote-url> (fetch) > origin <remote-url> (push) > upstream <remote-url> (fetch) > upstream <remote-url> (push) > > By seeing the above output anyone may assume that all the remotes are > normal remotes. If the user now try to run `git pull origin` and suddenly > he/she discover that some blobs are not downloaded. He/she run the above > mentioned (1) command and find that this is a promisor remote! > > Here `remote -v` didn't warn the user about the origin remote being an > promisor remote. Instead it makes him/her assume that all are normal > remotes. Providing only these three info (i.e. <remote-name>, <remote-url> > and <direction>) is not sufficient - it only shows the half of the picture. > > >> The question is "what can readers achieve by having this extra >> information in 'remote -v' output". Do you have to duck the >> question because you cannot answer in a simple sentence, and instead >> readers must read reams of documentation pages? I doubt it would be >> that obscure. > Sorry, I misunderstood that section of your first comment. I think > I hopefully answered this question in the above portion of this comment. > Providing only the three information about remotes is not sufficient > as it do not distinguish between promisor remotes and normal remotes. > In that sense, it will add simplicity and the user would be much more > clear about the remotes(i.e. which is promisor remote and which is not). > >> I wanted to like the patch, the changed text is simple enough, but >> quite honestly, the lack of clarity in the answers to the most basic >> "why do we want this? what is this good for? how does this help the >> users?" questions, I am not yet succeeding to do so. > My bad! Hope I am now able to answer all the questions you asked. Let > me know if you still struggle to get my point. > > Thanks :)
Philip Oakley <philipoakley@iee.email> wrote: > When I use the `git remote` command, I use the -vv variant to what what > I need, i.e. its more than `-v`, so maybe adding an extra > `--show-partial-filter` option may be necessary (with a more compact > name ;-). If adding new informations to `-v` is not possible (to avoid messy output), atleast including it to `-vv` makes sense to me (though I am not sure if `git remote -vv` is currently implemented). > There will also be a similar desire (IIUC) to match the sparse/cone mode > repos to their remotes, i.e. to remind a user that what is held at the > remote isn't the same as held locally. Yeah, maybe. > I hear your pain here. I had the same issue with the branch description. > (https://stackoverflow.com/questions/15058844/print-branch-description). > It's the same 'extract from config' problem. > > ```You can display the branch description using a git config command. > > To show all branch descriptions, I have the alias > > brshow = config --get-regexp 'branch.*.description' > > , and for the current HEAD I have > > brshow1 = !git config --get "branch.$(git rev-parse --abbrev-ref > HEAD).description". ``` > > so it is possible to generalise the config query, if hard to discover. > <https://stackoverflow.com/a/15062356/717355> Thanks for the info. I tried your suggestion and it worked. But still, it is better to include <list-object-filter> in the output. This is because of the second point I mentioned in my previous comment. Users can be much more clear about the types of available remotes overall. IMO specifying filters for remotes is far more important than the branch description. The behaviour of `git fetch` depends on it. If we can specify `(fetch)` in the output then why not the filter of that `fetch` on which the behaviour of `fetch` functionality highly depends? Thanks :)
diff --git a/builtin/remote.c b/builtin/remote.c index 5f4cde9d784..95e28b534f4 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -1190,7 +1190,15 @@ static int get_one_entry(struct remote *remote, void *priv) int i, url_nr; if (remote->url_nr > 0) { + struct strbuf promisor_config = STRBUF_INIT; + const char *partial_clone_filter = NULL; + + strbuf_addf(&promisor_config, "remote.%s.partialclonefilter", remote->name); strbuf_addf(&url_buf, "%s (fetch)", remote->url[0]); + if (!git_config_get_string_tmp(promisor_config.buf, &partial_clone_filter)) + strbuf_addf(&url_buf, " [%s]", partial_clone_filter); + + strbuf_release(&promisor_config); string_list_append(list, remote->name)->util = strbuf_detach(&url_buf, NULL); } else diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 4a3778d04a8..bf8f3644d3c 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -49,6 +49,17 @@ test_expect_success 'do partial clone 1' ' test "$(git -C pc1 config --local remote.origin.partialclonefilter)" = "blob:none" ' +test_expect_success 'filters for promisor remotes is listed by git remote -v' ' + git clone --filter=blob:none "file://$(pwd)/srv.bare" pc2 && + git -C pc2 remote -v >out && + grep "[blob:none]" out && + + git -C pc2 config remote.origin.partialCloneFilter object:type=commit && + git -C pc2 remote -v >out && + grep "[object:type=commit]" out && + rm -rf pc2 +' + test_expect_success 'verify that .promisor file contains refs fetched' ' ls pc1/.git/objects/pack/pack-*.promisor >promisorlist && test_line_count = 1 promisorlist &&