mbox series

[0/2] diff: support bare repositories when reading gitattributes for diff algorithm

Message ID pull.1459.git.git.1678758818.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series diff: support bare repositories when reading gitattributes for diff algorithm | expand

Message

Christoph Sommer via GitGitGadget March 14, 2023, 1:53 a.m. UTC
This patch series adds support for bare repositories for the feature added
in [1]. When using a bare repository, by default we will look for
gitattributes from HEAD. When the --attr-source option is passed, we will
try to read gitattributes from the commit.

A side effect of this patch series is that custom drivers will now also work
with bare repositories.

 1. (a4cf900ee7 diff: teach diff to read algorithm from diff driver,
    2022-02-20)

John Cai (2):
  diff: use HEAD for attributes when using bare repository
  diff: add --attr-source to read gitattributes from a commit

 Documentation/diff-options.txt  |  4 ++++
 Documentation/gitattributes.txt |  8 +++++++
 diff.c                          | 37 ++++++++++++++++++++++++++++++---
 diff.h                          |  1 +
 t/lib-diff-alternative.sh       | 33 ++++++++++++++++++++++++-----
 t/t4018-diff-funcname.sh        | 29 ++++++++++++++++++++++++++
 userdiff.c                      |  9 +++++++-
 userdiff.h                      |  4 ++++
 8 files changed, 116 insertions(+), 9 deletions(-)


base-commit: d15644fe0226af7ffc874572d968598564a230dd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1459%2Fjohn-cai%2Fjc%2Fdiff-attr-bare-repo-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1459/john-cai/jc/diff-attr-bare-repo-v1
Pull-Request: https://github.com/git/git/pull/1459

Comments

Philippe Blain March 14, 2023, 5:21 p.m. UTC | #1
Hi John,

Le 2023-03-13 à 21:53, John Cai via GitGitGadget a écrit :
> This patch series adds support for bare repositories for the feature added
> in [1]. When using a bare repository, by default we will look for
> gitattributes from HEAD. When the --attr-source option is passed, we will
> try to read gitattributes from the commit.

When I read that I immediately thought of the config settings 'mailmap.file', 
'mailmap.blob' which allow the same sort of thing, but for the mailmap.

For the sake of consistency of the whole system, maybe we would want to support
'attr.file' and 'attr.blob' ? In that case, maybe we don't need a new command
line option at all...

Cheers,

Philippe.
John Cai March 14, 2023, 7:18 p.m. UTC | #2
On 23/03/14 01:21PM, Philippe Blain wrote:
> Hi John,

Hi Philippe,

> 
> Le 2023-03-13 à 21:53, John Cai via GitGitGadget a écrit :
> > This patch series adds support for bare repositories for the feature added
> > in [1]. When using a bare repository, by default we will look for
> > gitattributes from HEAD. When the --attr-source option is passed, we will
> > try to read gitattributes from the commit.
> 
> When I read that I immediately thought of the config settings 'mailmap.file', 
> 'mailmap.blob' which allow the same sort of thing, but for the mailmap.
> 
> For the sake of consistency of the whole system, maybe we would want to support
> 'attr.file' and 'attr.blob' ? In that case, maybe we don't need a new command
> line option at all...

I wasn't aware of those config options. Indeed that's a good option! My only
concern with that is that there is already some precedence for passing a
<tree-ish> as a source for attributes in [1], so I thought adding a command line
option would be somewhat consistent.

But I can also see the benefit of using a config value since there is also
precedence because of mailmap.file and mailmap.blob. Not sure what others think,
but this may be the less invasive way to go.

1. https://git-scm.com/docs/git-check-attr#Documentation/git-check-attr.txt---sourcelttree-ishgt

> 
> Cheers,
> 
> Philippe.
Junio C Hamano March 14, 2023, 8:44 p.m. UTC | #3
John Cai <johncai86@gmail.com> writes:

> I wasn't aware of those config options. Indeed that's a good option! My only
> concern with that is that there is already some precedence for passing a
> <tree-ish> as a source for attributes in [1], so I thought adding a command line
> option would be somewhat consistent.
>
> But I can also see the benefit of using a config value since there is also
> precedence because of mailmap.file and mailmap.blob. Not sure what others think,
> but this may be the less invasive way to go.

I actually hate these one-off variables, and I wish we didn't do
mailmap.file or mailmap.blob at all.  Instead we could have taught
the machinery to read from $GIT_DIR/info/mailmap just like the
exclude and the attribute machinery already knows to read excludea
and attributes files in the directory.

As your "per filetype (determined by the attribute) diff algorithm
in a bare repository" feature already depends on being able to use
config (as the look-up is "attributes determines the diff filetype
name, and then diff.<filetype>.algo is looked up in the config") to
configure, it does not sound too bad to add the attribute info in
the $GIT_DIR/info/attributes file in your bare repository (and you'd
set the algorithm in $GIT_DIR/config file there), and it would just
work without any new "read from HEAD" feature, I would presume?

Thanks.
Jeff King March 16, 2023, 2:29 p.m. UTC | #4
On Tue, Mar 14, 2023 at 01:44:11PM -0700, Junio C Hamano wrote:

> John Cai <johncai86@gmail.com> writes:
> 
> > I wasn't aware of those config options. Indeed that's a good option! My only
> > concern with that is that there is already some precedence for passing a
> > <tree-ish> as a source for attributes in [1], so I thought adding a command line
> > option would be somewhat consistent.
> >
> > But I can also see the benefit of using a config value since there is also
> > precedence because of mailmap.file and mailmap.blob. Not sure what others think,
> > but this may be the less invasive way to go.
> 
> I actually hate these one-off variables, and I wish we didn't do
> mailmap.file or mailmap.blob at all.  Instead we could have taught
> the machinery to read from $GIT_DIR/info/mailmap just like the
> exclude and the attribute machinery already knows to read excludea
> and attributes files in the directory.

I don't think $GIT_DIR/info/mailmap is a good substitute for
mailmap.blob. The point there is to pull it from the repository, and
you'd have to do:

  git cat-file HEAD:.mailmap >.git/info/mailmap

before each command to get the equivalent. It is even worse for
attributes, because the in-tree representation is not even a single file
that is compatible with $GIT_DIR/ (there can be many attributes spread
throughout the tree).

It's true that mailmap.file, when specified in the local config, is
redundant with a repo-level info/mailmap file. I always assumed that
people put it in their ~/.gitconfig, though, to cover multiple projects
(just like we have core.attributesFile in addition to the in-repo
meta-file).

> As your "per filetype (determined by the attribute) diff algorithm
> in a bare repository" feature already depends on being able to use
> config (as the look-up is "attributes determines the diff filetype
> name, and then diff.<filetype>.algo is looked up in the config") to
> configure, it does not sound too bad to add the attribute info in
> the $GIT_DIR/info/attributes file in your bare repository (and you'd
> set the algorithm in $GIT_DIR/config file there), and it would just
> work without any new "read from HEAD" feature, I would presume?

I don't think that's quite a substitute.

If you have a static list of attributes, that is OK (though you are
presumably better off with /etc/gitattributes or similar that covers all
repos). But if you want to respect in-repo ones, you need to look at the
whole tree (and can't even just dump them out).

It does seem at first glance that you are limited to a static list of
attributes, because any diff "type" you mention has to have a
corresponding entry in the config to do anything useful. But these lists
are only loosely coupled. For example, you could have static entries
like this:

  # in config
  [diff "json"]
  algorithm = patience

  # in attributes
  *.json diff=json

but still tell users that "json" is a well-known name, and if they want
to get the same treatment for an arbitrary file "foo" that they would
get for "foo.json", they have to add attributes mapping it, like:

  foo diff=json

inside their repo.

-Peff
Junio C Hamano March 16, 2023, 4:44 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> If you have a static list of attributes, that is OK (though you are
> presumably better off with /etc/gitattributes or similar that covers all
> repos).
> ...
> but still tell users that "json" is a well-known name, and if they want
> to get the same treatment for an arbitrary file "foo" that they would
> get for "foo.json", they have to add attributes mapping it, like:
>
>   foo diff=json
>
> inside their repo.

Yes, you described the reason very well why we didn't add "global"
or "system-wide" attribute files.

Thanks for commenting on my somewhat tongue-in-cheek "devil's
advocate" remark.