[v3,0/2] check-attr: add support to work with revisions

Message ID 20221216093552.3171319-1-karthik.188@gmail.com (mailing list archive)

Message

karthik nayak Dec. 16, 2022, 9:35 a.m. UTC
v1: https://lore.kernel.org/git/20221206103736.53909-1-karthik.188@gmail.com/
v2: https://lore.kernel.org/git/CAOLa=ZSsFGBw3ta1jWN8cmUch2ca=zTEjp1xMA6Linafx9W53g@mail.gmail.com/T/#t

Given a pathname, git-check-attr(1) will list the attributes which apply to that
pathname by reading all relevant gitattributes files. Currently there is no way
to specify a revision to read the gitattributes from.

This is specifically useful in bare repositories wherein the gitattributes are
only present in the git working tree but not available directly on the
filesystem.

This series aims to add a new flag `-r|--revisions` to git-check-attr(1) which
allows us to read gitattributes from the specified revision.

Changes since version 2:
- Changes to the commit message [1/2] to use more specific terms and to
  be more descriptive.
- Moved the flag's position in the documentation to be before the unbound
  list of non-options.

Range-diff against v2:

1:  2e71cbbddd < -:  ---------- Git 2.39-rc2
-:  ---------- > 1:  57e2c6ebbe Start the 2.40 cycle
2:  898041f243 = 2:  c386de2d42 t0003: move setup for `--all` into new block
3:  12a72e09e0 ! 3:  b93a68b0c9 attr: add flag `-r|--revisions` to work with revisions
    @@ Metadata
      ## Commit message ##
         attr: add flag `-r|--revisions` to work with revisions
     
    -    Git check-attr currently doesn't check the git worktree, it either
    -    checks the index or the files directly. This means we cannot check the
    -    attributes for a file against a certain revision.
    +    The contents of the .gitattributes files may evolve over time, but "git
    +    check-attr" always checks attributes against them in the working tree
    +    and/or in the index. It may be beneficial to optionally allow the users
    +    to check attributes against paths from older commits.
     
    -    Add a new flag `--revision`/`-r` which will allow it work with
    -    revisions. This command will now, instead of checking the files/index,
    -    try and receive the blob for the given attribute file against the
    -    provided revision. The flag overrides checking against the index and
    -    filesystem and also works with bare repositories.
    +    Add a new flag `--revision`/`-r` which will allow users to check the
    +    attributes against a tree-ish revision. When the user uses this flag, we
    +    go through the stack of .gitattributes files but instead of checking the
    +    current working tree and/or in the index, we check the blobs from the
    +    provided tree-ish object. This allows the command to also be used in
    +    bare repositories.
    +
    +    Since we use a tree-ish object, the user can pass "-r HEAD:subdirectory"
    +    and all the attributes will be looked up as if subdirectory was the root
    +    directory of the repository.
     
         We cannot use the `<rev>:<path>` syntax like the one used in `git show`
         because any non-flag parameter before `--` is treated as an attribute
         and any parameter after `--` is treated as a pathname.
     
    -    This involves creating a new function `read_attr_from_blob`, which given
    -    the path reads the blob for the path against the provided revision and
    +    The change involves creating a new function `read_attr_from_blob`, which
    +    given the path reads the blob for the path against the provided revision and
         parses the attributes line by line. This function is plugged into
    -    `read_attr()` function wherein we go through the different attributes.
    +    `read_attr()` function wherein we go through the stack of attributes
    +    files.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
         Co-authored-by: toon@iotcl.com
    @@ Documentation/git-check-attr.txt: git-check-attr - Display gitattributes informa
      [verse]
     -'git check-attr' [-a | --all | <attr>...] [--] <pathname>...
     -'git check-attr' --stdin [-z] [-a | --all | <attr>...]
    -+'git check-attr' [-a | --all | <attr>...] [-r <revision>] [--] <pathname>...
    -+'git check-attr' --stdin [-z] [-a | --all | <attr>...] [-r <revision>]
    ++'git check-attr' [-r <revision>] [-a | --all | <attr>...] [--] <pathname>...
    ++'git check-attr' --stdin [-z] [-r <revision>] [-a | --all | <attr>...]
      
      DESCRIPTION
      -----------
    @@ Documentation/git-check-attr.txt: OPTIONS
      
     +--r <revision>::
     +--revision=<revision>::
    -+	Check attributes against the specified revision.
    ++	Check attributes against the specified tree-ish revision. All the
    ++	attributes will be checked against the provided revision. Paths provided
    ++	as part of the revision will be treated as the root directory.
     +
      \--::
      	Interpret all preceding arguments as attributes and all following
    @@ builtin/check-attr.c
      static const char * const check_attr_usage[] = {
     -N_("git check-attr [-a | --all | <attr>...] [--] <pathname>..."),
     -N_("git check-attr --stdin [-z] [-a | --all | <attr>...]"),
    -+N_("git check-attr [-a | --all | <attr>...] [-r <revision>] [--] <pathname>..."),
    -+N_("git check-attr --stdin [-z] [-a | --all | <attr>...] [-r <revision>]"),
    ++N_("git check-attr [-r <revision>] [-a | --all | <attr>...] [--] <pathname>..."),
    ++N_("git check-attr --stdin [-z] [-r <revision>] [-a | --all | <attr>...]"),
      NULL
      };
      


Karthik Nayak (2):
  t0003: move setup for `--all` into new block
  attr: add flag `-r|--revisions` to work with revisions

 Documentation/git-check-attr.txt |  10 +++-
 archive.c                        |   2 +-
 attr.c                           | 100 ++++++++++++++++++++++---------
 attr.h                           |   7 ++-
 builtin/check-attr.c             |  33 ++++++----
 builtin/pack-objects.c           |   2 +-
 convert.c                        |   2 +-
 ll-merge.c                       |   4 +-
 pathspec.c                       |   2 +-
 t/t0003-attributes.sh            |  71 +++++++++++++++++++++-
 userdiff.c                       |   2 +-
 ws.c                             |   2 +-
 12 files changed, 182 insertions(+), 55 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Dec. 16, 2022, 4:17 p.m. UTC | #1
On Fri, Dec 16 2022, Karthik Nayak wrote:

> v1: https://lore.kernel.org/git/20221206103736.53909-1-karthik.188@gmail.com/
> v2: https://lore.kernel.org/git/CAOLa=ZSsFGBw3ta1jWN8cmUch2ca=zTEjp1xMA6Linafx9W53g@mail.gmail.com/T/#t

Could you please set the In-Reply-To header appropriately in the future,
so that each version of this series isn't in its own disconnected
thread?
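
As a side note for readers, threading a new iteration under the previous one usually means passing the earlier cover letter's Message-ID when generating (or sending) the patches. The sketch below is only an illustration: the throwaway repository, the identity, and the Message-ID `previous-cover@example.com` are all made up, while `-v` and `--in-reply-to` are existing `git format-patch` options.

```shell
# Sketch: format one patch as "v4" and thread it under an earlier mail.
# The repository contents and the Message-ID are placeholders.
in_reply_to_demo() (
    set -e
    # Throwaway identity so the commits work without any user config.
    export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
    export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

    tmp=$(mktemp -d)
    git init -q "$tmp/repo"
    cd "$tmp/repo"
    echo one >file && git add file && git commit -q -m "first change"
    echo two >>file && git commit -q -a -m "second change"

    # -v4 marks the iteration; --in-reply-to records the parent Message-ID.
    git format-patch -q -v4 -1 \
        --in-reply-to='<previous-cover@example.com>' \
        -o "$tmp/out" HEAD

    # Show the header that mail clients use for threading.
    grep -h '^In-Reply-To:' "$tmp/out"/*.patch
)

in_reply_to_demo
```

`git send-email` accepts an `--in-reply-to` option as well, for the case where the patches have already been generated.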

> This series aims to add a new flag `-r|--revisions` to git-check-attr(1) which
> allows us to read gitattributes from the specified revision.

I didn't look at the v2, but expected at least the short form to be gone
here re
https://lore.kernel.org/git/CAOLa=ZTSzUh2Ma_EMHHWcDunGyKMaUW9BaG=QdegtMqLd+69Wg@mail.gmail.com/;

I'm still more partial to the alternate suggestion I had in
https://lore.kernel.org/git/221207.86lenja0zi.gmgdl@evledraar.gmail.com/;
I'm not sure what you meant in your reply at
https://lore.kernel.org/git/CAOLa=ZQua8TfApCdzoK06_2fkWb4ZCfWewXKOSaXno1fqFSq2A@mail.gmail.com/
(sorry about not following up at the time) with:

	"when being consistent we need to be fully consistent,
	i.e. <revision>:<path>, tweaking this slightly to be
	<revision>:<attr> is worse than breaking consistency."

Yes, it would, but isn't that by definition the case with any
proposal?

We don't have a way to refer to an attribute (or all attributes for -a)
for a given revision/path, the task of this series is to invent such a
syntax.

So we could invent that as this series currently does with:

	git check-attr --revision <rev> <attr>... <path>...

Or, as I suggested:

        git check-attr [<rev>:]<attr>... -- <path>...

Or whatever. Here I'm not saying that one is better than the other, but
advocating for one on the basis of consistency doesn't make sense to me,
this is new syntax.

I think what you mean is that because the log family uses "<rev>:<path>"
we should not come up with a syntax that looks anything like
"<lhs>:<rhs>", as the "<lhs>" in the mind of some users is going to be
"<rev>", and the "<rhs>" is "<path>", so it would be confusing to have
it be "<attr>" here, and have the "<path>..." come after the "--".

I'm not convinced by that. From refspecs to e.g. "git log"'s own "-L" we
have little mini-syntaxes in various places that use this sort of colon
notation. I find it more elegant than "--revision".

It's fine if you disagree, I'm just trying to understand the basis of
the disagreement.
Junio C Hamano Dec. 16, 2022, 10:38 p.m. UTC | #2
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> So we could invent that as this series currently does with:
>
> 	git check-attr --revision <rev> <attr>... <path>...
>
> Or, as I suggested:
>
>         git check-attr [<rev>:]<attr>... -- <path>...

What does <rev>:<attr> really mean?  As the syntax for the proposed
feature, I do not think it makes much sense.  For example:

  $ git check-attr HEAD:text HEAD^:text -- README.txt

 - With which README.txt are we checking the attribute?  The one
   taken from HEAD or HEAD^ or the index or the working tree?

 - When we say "README.txt has the text attribute", how does the
   user tell which "text" applies to the path?  From HEAD?  From
   HEAD^?

 - Does the same attribute 'text' have different meaning when coming
   from two different tree-ish?

Compared to that at least the proposed one makes it fairly clear
that we are talking about things in a single tree-ish consistently.
Junio C Hamano Dec. 16, 2022, 11:26 p.m. UTC | #3
Karthik Nayak <karthik.188@gmail.com> writes:

> Changes since version 2:
> - Changes to the commit message [1/2] to use more specific terms and to
>   be more descriptive.
> - Moved the flag's position in the documentation to be before the unbound
>   list of non-options.
>
> Range-diff against v2:
>
> 1:  2e71cbbddd < -:  ---------- Git 2.39-rc2
> -:  ---------- > 1:  57e2c6ebbe Start the 2.40 cycle

Does this new iteration use something that was added between these
two bases?  Asking because the choice of new base is questionable.
I would understand it if the rebase were on top of v2.39.0 tag,
though.

 * If the updated series depends on new APIs and features added
   since the old base, do rebase on the new one to take advantage of
   them.

 * A bugfix patch series may want to avoid using the newest and
   greatest if it allows the series to be applied to the older
   maintenance track, and keeping the older base may make more
   sense.

 * If a series based on an older base no longer merges cleanly to
   'master' and/or 'next', but rebasing on a newer base makes it
   merge cleanly, do rebase.

 * Otherwise, keeping the same base is preferred.

When rebasing is appropriate, choosing a well-known base (e.g. a
tagged release) helps others.
Junio C Hamano Dec. 16, 2022, 11:28 p.m. UTC | #4
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> This series aims to add a new flag `-r|--revisions` to git-check-attr(1) which
>> allows us to read gitattributes from the specified revision.
>
> I didn't look at the v2, but expected at least the short form to be gone
> here re
> https://lore.kernel.org/git/CAOLa=ZTSzUh2Ma_EMHHWcDunGyKMaUW9BaG=QdegtMqLd+69Wg@mail.gmail.com/;

It was unexpected to me, too.  Thanks for pointing it out.
Phillip Wood Dec. 17, 2022, 10:53 a.m. UTC | #5
Hi Karthik

On 16/12/2022 09:35, Karthik Nayak wrote:
> v1: https://lore.kernel.org/git/20221206103736.53909-1-karthik.188@gmail.com/
> v2: https://lore.kernel.org/git/CAOLa=ZSsFGBw3ta1jWN8cmUch2ca=zTEjp1xMA6Linafx9W53g@mail.gmail.com/T/#t
> 
> Given a pathname, git-check-attr(1) will list the attributes which apply to that
> pathname by reading all relevant gitattributes files. Currently there is no way
> to specify a revision to read the gitattributes from.
> 
> This is specifically useful in bare repositories wherein the gitattributes are
> only present in the git working tree but not available directly on the
> filesystem.

I was thinking about this and wondering if the problem is really that 
bare repositories ignore attributes because they don't have a working 
copy. If that's the case then we should perhaps be looking to fix that 
so that all git commands such as diff and log benefit rather than just
adding a flag to check-attr. A simple solution would be to read the 
attributes from HEAD in a bare repository in the same way that we 
fallback to the index if there are no attributes in the working copy for 
non-bare repositories.
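
To make the idea concrete, here is a rough sketch of where the data lives in a bare repository. It is not how git's attr stack works: it prints only the top-level `.gitattributes` committed at HEAD and ignores per-directory files. The helper names `attrs_from_head` and `bare_attrs_demo`, the repository layout, and the identity are all hypothetical.

```shell
# Sketch: print the top-level .gitattributes as committed at HEAD.
# Prints nothing if HEAD has no such file.
attrs_from_head() {
    git -C "$1" cat-file -p HEAD:.gitattributes 2>/dev/null || true
}

bare_attrs_demo() (
    set -e
    # Throwaway identity so the commit works without any user config.
    export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
    export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

    tmp=$(mktemp -d)
    git init -q "$tmp/src"
    cd "$tmp/src"
    echo '*.perl diff=perl' >.gitattributes
    git add .gitattributes
    git commit -q -m "add attributes"
    git clone -q --bare "$tmp/src" "$tmp/bare.git"

    # The bare clone has no working tree, yet HEAD still carries the file.
    git -C "$tmp/bare.git" rev-parse --is-bare-repository
    attrs_from_head "$tmp/bare.git"
)

bare_attrs_demo
```

A real fallback would have to live inside git's attribute code so that every command benefits; the sketch only shows that the information is reachable from HEAD.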

Best Wishes

Phillip

karthik nayak Dec. 17, 2022, 2:46 p.m. UTC | #6
On Fri, Dec 16, 2022 at 5:30 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Fri, Dec 16 2022, Karthik Nayak wrote:
>
> > v1: https://lore.kernel.org/git/20221206103736.53909-1-karthik.188@gmail.com/
> > v2: https://lore.kernel.org/git/CAOLa=ZSsFGBw3ta1jWN8cmUch2ca=zTEjp1xMA6Linafx9W53g@mail.gmail.com/T/#t
>
> Could you please set the In-Reply-To header appropriately in the future,
> so that each version of this series isn't in its own disconnected
> thread?
>

I didn't know, will do this from next time!

> > This series aims to add a new flag `-r|--revisions` to git-check-attr(1) which
> > allows us to read gitattributes from the specified revision.
>
> I didn't look at the v2, but expected at least the short form to be gone
> here re
> https://lore.kernel.org/git/CAOLa=ZTSzUh2Ma_EMHHWcDunGyKMaUW9BaG=QdegtMqLd+69Wg@mail.gmail.com/;
>

Right, I was open to it, but since there wasn't any confirmation, I didn't go
forward with it. Will remove it from the next version.

> I'm still more partial to the alternate suggestion I had in
> https://lore.kernel.org/git/221207.86lenja0zi.gmgdl@evledraar.gmail.com/;
> I'm not sure what you meant in your reply at
> https://lore.kernel.org/git/CAOLa=ZQua8TfApCdzoK06_2fkWb4ZCfWewXKOSaXno1fqFSq2A@mail.gmail.com/
> (sorry about not following up at the time) with:
>
>         "when being consistent we need to be fully consistent,
>         i.e. <revision>:<path>, tweaking this slightly to be
>         <revision>:<attr> is worse than breaking consistency."
>
> Yes, it would, but isn't that by definition the case with any
> proposal?
>

I'm not opposing the proposal, just stating my opinion on it. To go over my
reply:

I'm only saying that most users of Git are accustomed to the `<revision>:<path>`
syntax, and breaking that in only one command to be `<revision>:<attr>` seems
a bit odd from the user-experience point of view.

> We don't have a way to refer to an attribute (or all attributes for -a)
> for a given revision/path, the task of this series is to invent such a
> syntax.
>
> So we could invent that as this series currently does with:
>
>         git check-attr --revision <rev> <attr>... <path>...
>
> Or, as I suggested:
>
>         git check-attr [<rev>:]<attr>... -- <path>...
>
> Or whatever. Here I'm not saying that one is better than the other, but
> advocating for one on the basis of consistency doesn't make sense to me,
> this is new syntax.
>

I see what you mean, but I was referring to consistency around how different
options are used in other git commands.

Mainly that most commands treat the section after `<rev>:` as a path, so adding
a new option where the section after `<rev>:` is an attribute might be a bit
confusing.

> I think what you mean is that because the log family uses "<rev>:<path>"
> we should not come up with a syntax that looks anything like
> "<lhs>:<rhs>", as the "<lhs>" in the mind of some users is going to be
> "<rev>", and the "<rhs>" is "<path>", so it would be confusing to have
> it be "<attr>" here, and have the "<path>..." come after the "--".
>

Exactly.

> I'm not convinced by that. From refspecs to e.g. "git log"'s own "-L" we
> have little mini-syntaxes in various places that use this sort of colon
> notation. I find it more elegant than "--revision".
>
> It's fine if you disagree, I'm just trying to understand the basis of
> the disagreement.
>

I don't disagree. I think it's healthy to have this discussion, especially since
we're adding a new option and this is the right time. I'm all ears and
ultimately want to get the best solution.
karthik nayak Dec. 17, 2022, 2:49 p.m. UTC | #7
On Sat, Dec 17, 2022 at 12:26 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Karthik Nayak <karthik.188@gmail.com> writes:
>
> > Changes since version 2:
> > - Changes to the commit message [1/2] to use more specific terms and to
> >   be more descriptive.
> > - Moved the flag's position in the documentation to be before the unbound
> >   list of non-options.
> >
> > Range-diff against v2:
> >
> > 1:  2e71cbbddd < -:  ---------- Git 2.39-rc2
> > -:  ---------- > 1:  57e2c6ebbe Start the 2.40 cycle
>
> Does this new iteration use something that was added between these
> two bases?  Asking because the choice of new base is questionable.
> I would understand it if the rebase were on top of v2.39.0 tag,
> though.
>
>  * If the updated series depends on new APIs and features added
>    since the old base, do rebase on the new one to take advantage of
>    them.
>
>  * A bugfix patch series may want to avoid using the newest and
>    greatest if it allows the series to be applied to the older
>    maintenance track, and keeping the older base may make more
>    sense.
>
>  * If a series based on an older base no longer merges cleanly to
>    'master' and/or 'next', but rebasing on a newer base makes it
>    merge cleanly, do rebase.
>
>  * Otherwise, keeping the same base is preferred.
>
> When rebasing is appropriate, choosing a well-known base (e.g. a
> tagged release) helps others.

Right! I think I just have a habit of rebasing on top of master on a general
basis. I'll keep the old base and modify my tags to make sure the next
range-diff will use `2e71cbbddd` as the base for both ranges.
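
For readers unfamiliar with the command, a range-diff that uses the same base for both iterations can be produced with the three-argument form `git range-diff <base> <rev1> <rev2>`, which is equivalent to `<base>..<rev1> <base>..<rev2>`. The sketch below uses a throwaway repository; the tag `base` and the branch names `topic-v3` and `topic-v4` are made up for illustration.

```shell
# Sketch: compare two iterations of a one-patch topic that share a base.
range_diff_demo() (
    set -e
    # Throwaway identity so the commits work without any user config.
    export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
    export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

    tmp=$(mktemp -d)
    git init -q "$tmp/repo"
    cd "$tmp/repo"
    echo base >file && git add file && git commit -q -m "base"
    git tag base

    # Old iteration of the topic.
    git checkout -q -b topic-v3
    echo v3 >>file && git commit -q -a -m "topic: change file"

    # New iteration, started again from the same base.
    git checkout -q -b topic-v4 base
    echo v4 >>file && git commit -q -a -m "topic: change file"

    # Same base on both sides, so no unrelated commits show up.
    git range-diff base topic-v3 topic-v4
)

range_diff_demo
```

Because both ranges start at the same commit, the output lists only the topic's own patches rather than whatever happened upstream between two different bases.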
karthik nayak Dec. 17, 2022, 2:52 p.m. UTC | #8
On Sat, Dec 17, 2022 at 11:53 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> > This is specifically useful in bare repositories wherein the gitattributes are
> > only present in the git working tree but not available directly on the
> > filesystem.
>
> I was thinking about this and wondering if the problem is really that
> bare repositories ignore attributes because they don't have a working
> copy. If that's the case then we should perhaps be looking to fix that
> so that all git commands such as diff and log benefit rather than just
> adding a flag to check-attr. A simple solution would be to read the
> attributes from HEAD in a bare repository in the same way that we
> fallback to the index if there are no attributes in the working copy for
> non-bare repositories.
>

This is actually the direction I started this series in, but I soon realized
it's also useful to have a more generic version (which is currently what we have
in this patch series) which also satisfies the bare repository scenario. It
seemed like a natural extension.

I thought it's also useful because it lets you see how attributes change over
history.
Ævar Arnfjörð Bjarmason Dec. 19, 2022, 8:45 a.m. UTC | #9
On Sat, Dec 17 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> So we could invent that as this series currently does with:
>>
>> 	git check-attr --revision <rev> <attr>... <path>...
>>
>> Or, as I suggested:
>>
>>         git check-attr [<rev>:]<attr>... -- <path>...
>
> What does <rev>:<attr> really mean?  As the syntax for the proposed
> feature, I do not think it makes much sense.  For example:
>
>   $ git check-attr HEAD:text HEAD^:text -- README.txt
>
>  - With which README.txt are we checking the attribute?  The one
>    taken from HEAD or HEAD^ or the index or the working tree?

All of them, but I do think this rightly points out that the "rev before
path" part of this doesn't make sense, but shouldn't we be making this
work like "git grep" with <rev>/<path> combinations? I.e.:
	
	$ git -P grep -m 1 oid HEAD~:cache.h v2.26.0:cache.h v1.6.0:cache.h
	HEAD~:cache.h:#include "oid-array.h"
	v2.26.0:cache.h:void git_inflate_init(git_zstream *);
	v1.6.0:cache.h:static inline void copy_cache_entry(struct cache_entry *dst, struct cache_entry *src)

I.e. we currently support:

	git check-attr [-a | --all | <attr>...] [--] <pathname>...
	git check-attr --stdin [-z] [-a | --all | <attr>...]

So if we add to that:

	git check-attr --stdin [-z] <rev>:<pathname>...

We'd have this do the right thing:
	
	$ git check-attr diff -- README.md HEAD:git-send-email.perl v1.6.0:git-send-email.perl
	README.md: diff: unspecified
	HEAD:git-send-email.perl: diff: perl
	v1.6.0:git-send-email.perl: diff: perl

Which would technically break backwards compatibility, as we now
"support" it (we just interpret the whole thing as a path), but I think
such revision-looking paths aren't worth worrying about.
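
A minimal sketch of how such an argument could be split, assuming the hypothetical form `[<rev>:]<pathname>` and splitting at the first colon. The helper name `split_rev_path` is made up, and nothing here reflects how git itself would parse it; it mainly shows the compatibility wrinkle mentioned above.

```shell
# Sketch: split one argument of the hypothetical form [<rev>:]<pathname>.
# Everything before the first colon is taken as the revision; an argument
# without a colon is treated as a plain path with an empty revision.
split_rev_path() {
    case $1 in
    *:*) printf 'rev=%s path=%s\n' "${1%%:*}" "${1#*:}" ;;
    *)   printf 'rev=%s path=%s\n' "" "$1" ;;
    esac
}

split_rev_path README.md
split_rev_path HEAD:git-send-email.perl
split_rev_path v1.6.0:git-send-email.perl
```

A file whose name really contains a colon would be misread as `<rev>:<pathname>` under this rule, which is exactly the backwards-compatibility break being discussed.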

>  - When we say "README.txt has the text attribute", how does the
>    user tell which "text" applies to the path?  From HEAD?  From
>    HEAD^?

Regardless of what I'm suggesting here, the "git check-attr" output
already has a one-to-one line output correspondence with its input, so
just as it does now we'd print both.

This looks like a bug though (on master, the missing "\n" is there in
the output):

	$ ./git check-attr diffgit-send-email.perl foo.perl git-send-email.perl
	foo.perl: diffgit-send-email.perl: unspecified
	git-send-email.perl: diffgit-send-email.perl: unspecified

>  - Does the same attribute 'text' have different meaning when coming
>    from two different tree-ish?

Yes, just like "git grep", we'd need to parse & apply the .gitattributes
for that revision. Whether we call it "<rev>:<path>", "--revision <rev>
<path>" or whatever we'd always want to do that, otherwise what's the
point?
Ævar Arnfjörð Bjarmason Dec. 19, 2022, 9:45 a.m. UTC | #10
On Fri, Dec 16 2022, Karthik Nayak wrote:

> v1: https://lore.kernel.org/git/20221206103736.53909-1-karthik.188@gmail.com/
> v2: https://lore.kernel.org/git/CAOLa=ZSsFGBw3ta1jWN8cmUch2ca=zTEjp1xMA6Linafx9W53g@mail.gmail.com/T/#t
>
> Given a pathname, git-check-attr(1) will list the attributes which apply to that
> pathname by reading all relevant gitattributes files. Currently there is no way
> to specify a revision to read the gitattributes from.
>
> This is specifically useful in bare repositories wherein the gitattributes are
> only present in the git working tree but not available directly on the
> filesystem.
>
> This series aims to add a new flag `-r|--revisions` to git-check-attr(1) which
> allows us to read gitattributes from the specified revision.
>
> Changes since version 2:
> - Changes to the commit message [1/2] to use more specific terms and to
>   be more descriptive.
> - Moved the flag's position in the documentation to be before the unbound
>   list of non-options.

Aside from the UX concerns with this series, this segfaults with it, but
not on "master":
	
	./git check-attr diff git-send-email.perl foo.perl git-send-email.perl
	AddressSanitizer:DEADLYSIGNAL
	=================================================================
	==1828755==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x0000008ee4a8 bp 0x7fffe4cef820 sp 0x7fffe4cef800 T0)
	==1828755==The signal is caused by a READ memory access.
	==1828755==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
	    #0 0x8ee4a8 in hasheq_algop hash.h:236
	    #1 0x8ee632 in oideq hash.h:253
	    #2 0x8ee657 in is_null_oid hash.h:258
	    #3 0x8f79e2 in do_oid_object_info_extended object-file.c:1550
	    #4 0x8f8206 in oid_object_info_extended object-file.c:1640
	    #5 0x8f860c in read_object object-file.c:1672
	    #6 0x8f8a8a in read_object_file_extended object-file.c:1715
	    #7 0x8f01ef in repo_read_object_file object-store.h:253
	    #8 0x8f8e37 in read_object_with_reference object-file.c:1756
	    #9 0xafb411 in get_tree_entry tree-walk.c:612
	    #10 0x6d1975 in read_attr_from_blob attr.c:776
	    #11 0x6d1b80 in read_attr attr.c:826
	    #12 0x6d1f35 in bootstrap_attr_stack attr.c:912
	    #13 0x6d2173 in prepare_attr_stack attr.c:948
	    #14 0x6d3285 in collect_some_attrs attr.c:1143
	    #15 0x6d33e1 in git_check_attr attr.c:1157
	    #16 0x453581 in check_attr builtin/check-attr.c:72
	    #17 0x453f1f in cmd_check_attr builtin/check-attr.c:190
	    #18 0x40b63d in run_builtin git.c:466
	    #19 0x40bf7f in handle_builtin git.c:721
	    #20 0x40c686 in run_argv git.c:788
	    #21 0x40d42f in cmd_main git.c:926
	    #22 0x6885b5 in main common-main.c:57
	    #23 0x7f96725a8189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
	    #24 0x7f96725a8244 in __libc_start_main_impl ../csu/libc-start.c:381
	    #25 0x407230 in _start (git+0x407230)
	
	AddressSanitizer can not provide additional info.
	SUMMARY: AddressSanitizer: SEGV hash.h:236 in hasheq_algop
	==1828755==ABORTING
	Aborted

If the tests are still passing for you (I didn't check) then we probably
have a bad test blind spot that we should start by addressing
before adding the new feature.
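
As an illustration of the kind of check that would cover this path, the sketch below runs `git check-attr` without any revision in a throwaway repository and returns the command's exit status. It assumes the `git` on PATH is the build under test; the helper name `check_attr_smoke` is made up, and a real regression test would live in t/t0003-attributes.sh using git's own test framework.

```shell
# Sketch: smoke-check the plain (no revision) code path of check-attr.
check_attr_smoke() (
    tmp=$(mktemp -d) || exit 1
    git init -q "$tmp/repo" || exit 1
    cd "$tmp/repo" || exit 1
    echo '*.perl diff=perl' >.gitattributes

    # The paths need not exist; attributes are matched by name.
    # The function's status is the status of this command, so a crash
    # (killed by a signal) shows up as a non-zero return.
    git check-attr diff -- foo.perl git-send-email.perl
)

check_attr_smoke
```

Running the same invocation under a sanitizer build, as in the report above, would catch the crash even if the exit status alone did not.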
karthik nayak Dec. 19, 2022, 1:16 p.m. UTC | #11
On Mon, Dec 19, 2022 at 10:46 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Fri, Dec 16 2022, Karthik Nayak wrote:
>
> > v1: https://lore.kernel.org/git/20221206103736.53909-1-karthik.188@gmail.com/
> > v2: https://lore.kernel.org/git/CAOLa=ZSsFGBw3ta1jWN8cmUch2ca=zTEjp1xMA6Linafx9W53g@mail.gmail.com/T/#t
> >
> > Given a pathname, git-check-attr(1) will list the attributes which apply to that
> > pathname by reading all relevant gitattributes files. Currently there is no way
> > to specify a revision to read the gitattributes from.
> >
> > This is specifically useful in bare repositories wherein the gitattributes are
> > only present in the git working tree but not available directly on the
> > filesystem.
> >
> > This series aims to add a new flag `-r|--revisions` to git-check-attr(1) which
> > allows us to read gitattributes from the specified revision.
> >
> > Changes since version 2:
> > - Changes to the commit message [1/2] to use more specific terms and to
> >   be more descriptive.
> > - Moved the flag's position in the documentation to be before the unbound
> >   list of non-options.
>
> Aside from the UX concerns with this series, this segfaults with it, but
> not on "master":
>
>         ./git check-attr diff git-send-email.perl foo.perl git-send-email.perl
>         AddressSanitizer:DEADLYSIGNAL
>         =================================================================
>         ==1828755==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x0000008ee4a8 bp 0x7fffe4cef820 sp 0x7fffe4cef800 T0)
>         ==1828755==The signal is caused by a READ memory access.
>         ==1828755==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
>             #0 0x8ee4a8 in hasheq_algop hash.h:236
>             #1 0x8ee632 in oideq hash.h:253
>             #2 0x8ee657 in is_null_oid hash.h:258
>             #3 0x8f79e2 in do_oid_object_info_extended object-file.c:1550
>             #4 0x8f8206 in oid_object_info_extended object-file.c:1640
>             #5 0x8f860c in read_object object-file.c:1672
>             #6 0x8f8a8a in read_object_file_extended object-file.c:1715
>             #7 0x8f01ef in repo_read_object_file object-store.h:253
>             #8 0x8f8e37 in read_object_with_reference object-file.c:1756
>             #9 0xafb411 in get_tree_entry tree-walk.c:612
>             #10 0x6d1975 in read_attr_from_blob attr.c:776
>             #11 0x6d1b80 in read_attr attr.c:826
>             #12 0x6d1f35 in bootstrap_attr_stack attr.c:912
>             #13 0x6d2173 in prepare_attr_stack attr.c:948
>             #14 0x6d3285 in collect_some_attrs attr.c:1143
>             #15 0x6d33e1 in git_check_attr attr.c:1157
>             #16 0x453581 in check_attr builtin/check-attr.c:72
>             #17 0x453f1f in cmd_check_attr builtin/check-attr.c:190
>             #18 0x40b63d in run_builtin git.c:466
>             #19 0x40bf7f in handle_builtin git.c:721
>             #20 0x40c686 in run_argv git.c:788
>             #21 0x40d42f in cmd_main git.c:926
>             #22 0x6885b5 in main common-main.c:57
>             #23 0x7f96725a8189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>             #24 0x7f96725a8244 in __libc_start_main_impl ../csu/libc-start.c:381
>             #25 0x407230 in _start (git+0x407230)
>
>         AddressSanitizer can not provide additional info.
>         SUMMARY: AddressSanitizer: SEGV hash.h:236 in hasheq_algop
>         ==1828755==ABORTING
>         Aborted
>
> If the tests are still passing for you (I didn't check) then we probably
> have a bad test blind spot that we should start by addressing
> before adding the new feature.

This seems to be what Junio mentioned here:
https://lore.kernel.org/git/xmqqcz8ikgxs.fsf@gitster.g/
Should be fixed in v4!