Message ID | pull.1689.git.1710183362.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | cat-file: add %(objectmode) avoid verifying submodules' OIDs | expand |
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > The cat-file --batch command is very valuable in server settings, but so far > it is missing a bit of functionality that would come in handy there. > > For example, it is sometimes necessary to determine the object mode of a > batch of tree objects' children. OK. It is somewhat unsatisfying that --batch/--batch-check lacks so much. Even with %(objectmode) its nature of one-object-at-a-time makes querying children of a large tree a chore, when you compare it with something like "cat-file -p HEAD:" that allows you to grab the needed information for all children with a single invocation. This is orthogonal to what the patch wants to do, which is to enrich the output side with more formatting, bit I wonder if we want to consider enriching the input side? e.g. instead of feeding just a single object name from the standard input of "cat-file --batch/--batch-check", perhaps a syntax can say "Here I have the object name for a tree-ish object, but please pretend that I gave you all the objects contained within it", or something? Thanks, will queue.
On Mon, Mar 11, 2024 at 02:43:00PM -0700, Junio C Hamano wrote: > It is somewhat unsatisfying that --batch/--batch-check lacks so > much. Even with %(objectmode) its nature of one-object-at-a-time > makes querying children of a large tree a chore, when you compare it > with something like "cat-file -p HEAD:" that allows you to grab the > needed information for all children with a single invocation. > > This is orthogonal to what the patch wants to do, which is to enrich > the output side with more formatting, bit I wonder if we want to > consider enriching the input side? e.g. instead of feeding just a > single object name from the standard input of "cat-file > --batch/--batch-check", perhaps a syntax can say "Here I have the > object name for a tree-ish object, but please pretend that I gave > you all the objects contained within it", or something? That is an interesting direction. In practice I guess you might want to expand trees (to show their contents) or perhaps commits (to traverse history and/or look at their trees). And we already have tools to do that. So for example you can already do: git ls-tree --format='%(objectname) %(objectmode)' HEAD Or if you wanted to mix-and-match with other cat-file placeholders, you can do: git ls-tree --format='%(objectname) %(objectmode)' HEAD | git cat-file --batch-check='%(objectname) %(deltabase) %(rest)' That is a little less efficient (we look up the object twice), but once you are working with hex object ids it is not too bad (cat-file is heavily optimized here). Of course in the long run I think we should move to a future where the formatting code is shared, and you can just ask ls-tree for deltabase if you want to. I think leaving this to specialized tools like ls-tree gives them a lot of flexibility that a special input mode to cat-file might find awkward. For example, recurse vs non-recursive tree listing. Or filtering with pathspecs. And of course when you get into commits and traversal, there are many rev-list options. :) The strategy so far has been making sure cat-file can efficiently take in the output of these other tools to further describe objects. But moving towards a unified output formatting model would be even better, I think. In the meantime, I think cat-file learning %(objectmode) makes sense for single names (rather than listing trees), and fortunately it uses the same (obvious) name that ls-tree does, so we won't have a problem unifying them later. The patch itself looked reasonable to me, modulo the comments you already made. -Peff
Jeff King <peff@peff.net> writes: > That is an interesting direction. In practice I guess you might want to > expand trees (to show their contents) or perhaps commits (to traverse > history and/or look at their trees). And we already have tools to do > that. > > So for example you can already do: > > git ls-tree --format='%(objectname) %(objectmode)' HEAD > > Or if you wanted to mix-and-match with other cat-file placeholders, you > can do: > > git ls-tree --format='%(objectname) %(objectmode)' HEAD | > git cat-file --batch-check='%(objectname) %(deltabase) %(rest)' > > That is a little less efficient (we look up the object twice), but once > you are working with hex object ids it is not too bad (cat-file is > heavily optimized here). Of course in the long run I think we should > move to a future where the formatting code is shared, and you can just > ask ls-tree for deltabase if you want to. I was imagining more about a use case "cat-file --batch" was originally designed for---having a long-running single process and ask any and all questions you have about various objects in the object database by interacting with it. So "yes, ls-tree can already give us that information", while it is true, shoots at a different direction from what I had in mind. > The strategy so far has been making sure cat-file can efficiently take > in the output of these other tools to further describe objects. But > moving towards a unified output formatting model would be even better, I > think. In the meantime, I think cat-file learning %(objectmode) makes > sense for single names (rather than listing trees), and fortunately it > uses the same (obvious) name that ls-tree does, so we won't have a > problem unifying them later. Yes, enriching the output format side is an orthogonal issue from the input side, and the %(objectmode) thing that gives a piece of information that is additionally available on top of the various pieces of information about the object itself does make sense. > The patch itself looked reasonable to me, modulo the comments you > already made. > > -Peff
On Tue, Mar 12, 2024 at 12:28:48PM -0700, Junio C Hamano wrote: > > Or if you wanted to mix-and-match with other cat-file placeholders, you > > can do: > > > > git ls-tree --format='%(objectname) %(objectmode)' HEAD | > > git cat-file --batch-check='%(objectname) %(deltabase) %(rest)' > > > > That is a little less efficient (we look up the object twice), but once > > you are working with hex object ids it is not too bad (cat-file is > > heavily optimized here). Of course in the long run I think we should > > move to a future where the formatting code is shared, and you can just > > ask ls-tree for deltabase if you want to. > > I was imagining more about a use case "cat-file --batch" was > originally designed for---having a long-running single process > and ask any and all questions you have about various objects in the > object database by interacting with it. So "yes, ls-tree can > already give us that information", while it is true, shoots at a > different direction from what I had in mind. Ah, yeah, that is one thing that cat-file does that no other part of the system will. I do wonder in the long term if it is easier to teach cat-file everything that all of the other commands can do, or to teach all of the other commands some way of handling multiple requests in a single process. ;) (All obviously orthogonal to this patch series). -Peff