Message ID | pull.1632.v4.git.1707196348.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Enrich Trailer API | expand |
On Tue, Feb 6, 2024 at 6:12 AM Linus Arver via GitGitGadget <gitgitgadget@gmail.com> wrote: > > This patch series is the first 10 patches of a larger cleanup/bugfix series > (henceforth "larger series") I've been working on. There are now 28 patches in this series. I took a look at all of them, and I think that this series should be split into 4 or more series. For example perhaps one series until the "trailer: move interpret_trailers() to interpret-trailers.c" patch, then another one until "trailer: finish formatting unification". etc. Also I think it might be possible to avoid some test failures introduced by some patches. If it's not possible, I agree with Junio that it would be nice if the failing tests were changed to use 'test_expect_failure'. Also it seems to me that many patches towards the end of this series should be split. > The main goal of this > series is to begin the process of "libifying" the trailer API. By "API" I > mean the interface exposed in trailer.h. The larger series brings a number > of additional cleanups (exposing and fixing some bugs along the way), and > builds on top of this series. [...] > With the libification-focused goals out of the way, let's turn to this patch > series in more detail. I like the goal of libifying Git the trailer API, and the way you want to do it seems reasonable to me. [...] > In summary this series breaks up "process_trailers()" into smaller pieces, > exposing many of the parts relevant to trailer-related processing in > trailer.h. This will force us to eventually introduce unit tests for these > API functions, but that is a good thing for API stability. I am a bit sad that this series doesn't introduce unit tests using the new test framework in C yet. I understand that this series is mostly a big refactoring and maybe it's better to introduce unit tests only when the refactoring is finished though. Anyway I hope the next series will introduce such tests. > In the future after libification is "complete", users external to Git will > be able to use the same trailer processing API used by the > interpret-trailers builtin. For example, a web server may want to parse > trailers the same way that Git would parse them, without having to call > interpret-trailers as a subprocess. This use case was the original > motivation behind my work in this area. Thanks for telling us about this use case. > Thanks to the aggressive refactoring in this series, I've been able to > identify and fix several bugs in our existing implementation. Those fixes > build on top of this series but were not included here, in order to keep > this series small. Below is a "shortlog" of those fixes I have locally: > > * "trailer: trailer replacement should not change its position" (If we > found a trailer we'd like to replace, preserve its position relative to > the other trailers found in the trailer block, instead of always moving > it to the beginning or end of the entire trailer block.) I believe there was a reason why it was done this way. I don't remember what it was though. > * "interpret-trailers: preserve trailers coming from the input" (Sometimes, > the parsed trailers from the input will be formatted differently > depending on whether we provide --only-trailers or not. Make the trailers > that were not modified and which are coming directly from the input get > formatted the same way, regardless of this flag.) It could be a feature to be able to normalize trailers in a certain way. > * "interpret-trailers: do not modify the input if NOP" (Refrain from > subtracting or adding a newline around the patch divider "---" if we are > not adding new trailers.) It could be a feature to be able to normalize this too. > * "trailer formatter: split up format_trailer() monolith" (Fix a bug in > git-log where we still printed a blank newline even if we didn't want to > format anything.) I am not sure this is a bug fix either. It could perhaps be a normalization too. > * "interpret-trailers: fail if given unrecognized arguments" (E.g., for > "--where", only accept recognized WHERE_* enum values. If we get > something unrecognized, fail with an error instead of silently doing > nothing. Ditto for "--if-exists" and "--if-missing".) It's possible that there was a reason why it was done this way. I think you might want to take a look at the discussions on the mailing list when "interpret-trailers" was developed. There were a lot of discussions over a long time, and there were a lot of requests and suggestions about what it should do.
Christian Couder <christian.couder@gmail.com> writes: > On Tue, Feb 6, 2024 at 6:12 AM Linus Arver via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> This patch series is the first 10 patches of a larger cleanup/bugfix series >> (henceforth "larger series") I've been working on. > > There are now 28 patches in this series. > > I took a look at all of them, and I think that this series should be > split into 4 or more series. I presume that [01-09/28] would be the first part, nothing controversial and consisting of obvious clean-ups? I do not mind merging that part down to remove the future review load if everybody agrees. Thanks.
On Tue, Feb 13, 2024 at 1:11 AM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > > On Tue, Feb 6, 2024 at 6:12 AM Linus Arver via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > >> > >> This patch series is the first 10 patches of a larger cleanup/bugfix series > >> (henceforth "larger series") I've been working on. > > > > There are now 28 patches in this series. > > > > I took a look at all of them, and I think that this series should be > > split into 4 or more series. > > I presume that [01-09/28] would be the first part, nothing > controversial and consisting of obvious clean-ups? I do not mind > merging that part down to remove the future review load if everybody > agrees. Yeah, patches [01-09/28] look good to me. If we are merging them without the rest, we might want to change a bit the last sentence in 09/28 which says: "In the next patch, we'll change format_trailer_info() to use the parsed trailer_item objects instead of the string array." Maybe just: s/In the next patch/In a future patch/
Christian Couder <christian.couder@gmail.com> writes: >> > I took a look at all of them, and I think that this series should be >> > split into 4 or more series. >> >> I presume that [01-09/28] would be the first part, nothing >> controversial and consisting of obvious clean-ups? I do not mind >> merging that part down to remove the future review load if everybody >> agrees. > > Yeah, patches [01-09/28] look good to me. I was hoping that you'll give us more details of what the other 3 or more you would envision the series to be, actually.
Christian Couder <christian.couder@gmail.com> writes: > On Tue, Feb 6, 2024 at 6:12 AM Linus Arver via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> This patch series is the first 10 patches of a larger cleanup/bugfix series >> (henceforth "larger series") I've been working on. > > There are now 28 patches in this series. > > I took a look at all of them, and I think that this series should be > split into 4 or more series. This sounds fine to me. IIRC this means for the 2nd, 3rd+ series I have to remember to say "this series builds on top of the other topic branches '...'" in the cover letter. Now that I've written this out I will hopefully not forget to do this... Or, I suppose I could just introduce the 1st sub-series, wait for that to get queued to next, then (re)introduce the 2nd sub-series, etc, in order. Hmm. I think this will be simpler. > For example perhaps one series until the "trailer: move > interpret_trailers() to interpret-trailers.c" patch, then another one > until "trailer: finish formatting unification". etc. > > Also I think it might be possible to avoid some test failures > introduced by some patches. If it's not possible, I agree with Junio > that it would be nice if the failing tests were changed to use > 'test_expect_failure'. As the breakages are deliberate, I will have to go with using "test_expect_failure". > Also it seems to me that many patches towards the end of this series > should be split. In hindsight, I fully agree. Aside: I am delighted with the quality of reviews on this project. It's not something I am used to, so please bear with me while I try to break old habits. Thanks. >> The main goal of this >> series is to begin the process of "libifying" the trailer API. By "API" I >> mean the interface exposed in trailer.h. The larger series brings a number >> of additional cleanups (exposing and fixing some bugs along the way), and >> builds on top of this series. > > [...] > >> With the libification-focused goals out of the way, let's turn to this patch >> series in more detail. > > I like the goal of libifying Git the trailer API, and the way you want > to do it seems reasonable to me. > > [...] > >> In summary this series breaks up "process_trailers()" into smaller pieces, >> exposing many of the parts relevant to trailer-related processing in >> trailer.h. This will force us to eventually introduce unit tests for these >> API functions, but that is a good thing for API stability. > > I am a bit sad that this series doesn't introduce unit tests using the > new test framework in C yet. I understand that this series is mostly a > big refactoring and maybe it's better to introduce unit tests only > when the refactoring is finished though. This was my original goal as well, and still is. > Anyway I hope the next series will introduce such tests. I will see which API functions are stable enough, and add tests accordingly (in a patch series sooner than later). Probably the "biggest" (?) thing that is coming from the larger series is the introduction of a complete separation between parsing (without any modification of the input) and formatting. The parser/formatter is a large chunk of the trailer implementation, so I would expect unit tests for those bits to have to wait until those improvements are merged into "next". > [...] > >> Thanks to the aggressive refactoring in this series, I've been able to >> identify and fix several bugs in our existing implementation. Those fixes >> build on top of this series but were not included here, in order to keep >> this series small. Below is a "shortlog" of those fixes I have locally: >> >> * "trailer: trailer replacement should not change its position" (If we >> found a trailer we'd like to replace, preserve its position relative to >> the other trailers found in the trailer block, instead of always moving >> it to the beginning or end of the entire trailer block.) > > I believe there was a reason why it was done this way. I don't > remember what it was though. Noted. I'll see what I can find in our commit history. >> * "interpret-trailers: preserve trailers coming from the input" (Sometimes, >> the parsed trailers from the input will be formatted differently >> depending on whether we provide --only-trailers or not. Make the trailers >> that were not modified and which are coming directly from the input get >> formatted the same way, regardless of this flag.) > > It could be a feature to be able to normalize trailers in a certain way. True. But doing such normalization silently is undocumented behavior, and we should provide explicit flags for this sort of thing. Adding such flags might be the right thing to do (let's discuss more when this patch gets proposed). FWIW the behavior I observed is that this normalization only happens for *some* trailers that have configuration options, not all trailers in the input. So it's a special kind of (limited) normalization. >> * "interpret-trailers: do not modify the input if NOP" (Refrain from >> subtracting or adding a newline around the patch divider "---" if we are >> not adding new trailers.) > > It could be a feature to be able to normalize this too. OK, but it would again be undocumented behavior. >> * "trailer formatter: split up format_trailer() monolith" (Fix a bug in >> git-log where we still printed a blank newline even if we didn't want to >> format anything.) > > I am not sure this is a bug fix either. It could perhaps be a normalization too. See my comment above. >> * "interpret-trailers: fail if given unrecognized arguments" (E.g., for >> "--where", only accept recognized WHERE_* enum values. If we get >> something unrecognized, fail with an error instead of silently doing >> nothing. Ditto for "--if-exists" and "--if-missing".) > > It's possible that there was a reason why it was done this way. > > I think you might want to take a look at the discussions on the > mailing list when "interpret-trailers" was developed. There were a lot > of discussions over a long time, and there were a lot of requests and > suggestions about what it should do. I confess I haven't looked too deeply into the original threads surrounding the introduction of "interpret-trailers". But all of the changes which I categorize as bugfixes above have a theme of undocumented modifications. While working on this (and the larger) series around trailers, I only looked into some (not all) of the discussions on the mailing list in this area. Instead, I deferred to Documentation/git-interpret-trailers.txt as the official (authoritative) source of truth. This is partly why I first started out on this project last year by making improvements to that doc. And, this is why seeing so many edge cases where we perform such undocumented modifications smelled more like bugs to me than features. That being said, I cannot disagree with your position that the so-called bugfixes I've reported above could be misguided (and should rather be just updates to missing documentation). Let's not try to decide that here, but instead later when I get to introduce those changes on the list, one at a time. Thanks.
Linus Arver <linusa@google.com> writes: > This sounds fine to me. IIRC this means for the 2nd, 3rd+ series I have > to remember to say "this series builds on top of the other topic > branches '...'" in the cover letter. Now that I've written this out I > will hopefully not forget to do this... > > Or, I suppose I could just introduce the 1st sub-series, wait for that > to get queued to next, then (re)introduce the 2nd sub-series, etc, in > order. Hmm. I think this will be simpler. FWIW, which was how the recent flurry of topics related to the reftable backend were done by Patrick, which was quite nice. People certainly need the feel of larger picture to get motivated to review an earlier part of the series, but now that they saw the projected end-game, splitting these into 4 series, each with materials in 7 or so patches in v4, and presenting in not-so-quick succession one by one, would make it less distracting and less daunting. >> I am a bit sad that this series doesn't introduce unit tests using the >> new test framework in C yet. I understand that this series is mostly a >> big refactoring and maybe it's better to introduce unit tests only >> when the refactoring is finished though. > > This was my original goal as well, and still is. That's OK. If v4 were 40-patch series instead of 28, I am sure you would have had the unit-test part near the end. So a bite sized series of serieses may not show the unit-tests while the earlier batches are still cooking, but as long as we all are aiming for that same goal, we are fine. Let's help ourselves get there soon by reviewing each other's patches ;-). Thanks, both.
On Tue, Feb 13, 2024, at 20:57, Junio C Hamano wrote: > Linus Arver <linusa@google.com> writes: > >> This sounds fine to me. IIRC this means for the 2nd, 3rd+ series I have >> to remember to say "this series builds on top of the other topic >> branches '...'" in the cover letter. Now that I've written this out I >> will hopefully not forget to do this... >> >> Or, I suppose I could just introduce the 1st sub-series, wait for that >> to get queued to next, then (re)introduce the 2nd sub-series, etc, in >> order. Hmm. I think this will be simpler. > > FWIW, which was how the recent flurry of topics related to the > reftable backend were done by Patrick, which was quite nice. People > certainly need the feel of larger picture to get motivated to review > an earlier part of the series, but now that they saw the projected > end-game, splitting these into 4 series, each with materials in 7 or > so patches in v4, and presenting in not-so-quick succession one by > one, would make it less distracting and less daunting. I feel like I’m learning some things about how version control programs (and accompanying review software (here email)) can be vital—not just helpful, not just a nice-to-have—for structuring reviews of large changes from observing these conversations. Thanks to both/all.
(Sorry for sending this previously to Junio only) On Tue, Feb 13, 2024 at 6:30 PM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > >> I presume that [01-09/28] would be the first part, nothing > >> controversial and consisting of obvious clean-ups? I do not mind > >> merging that part down to remove the future review load if everybody > >> agrees. > > > > Yeah, patches [01-09/28] look good to me. > > I was hoping that you'll give us more details of what the other 3 or > more you would envision the series to be, actually. I think the next one could be [10-16/28], so until "trailer: finish formatting unification". Then I am not sure about the next one, perhaps [17-20/28] or [17-21/28]. The rest would depend on the splitting of the big patches towards the end of the series.
On Tue, Feb 13, 2024 at 8:39 PM Linus Arver <linusa@google.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > I took a look at all of them, and I think that this series should be > > split into 4 or more series. > > This sounds fine to me. IIRC this means for the 2nd, 3rd+ series I have > to remember to say "this series builds on top of the other topic > branches '...'" in the cover letter. Now that I've written this out I > will hopefully not forget to do this... > > Or, I suppose I could just introduce the 1st sub-series, wait for that > to get queued to next, then (re)introduce the 2nd sub-series, etc, in > order. Hmm. I think this will be simpler. Yeah, sure. > > Also it seems to me that many patches towards the end of this series > > should be split. > > In hindsight, I fully agree. > > Aside: I am delighted with the quality of reviews on this project. It's > not something I am used to, so please bear with me while I try to break > old habits. Sure no worries. > Thanks. [...] > > I am a bit sad that this series doesn't introduce unit tests using the > > new test framework in C yet. I understand that this series is mostly a > > big refactoring and maybe it's better to introduce unit tests only > > when the refactoring is finished though. > > This was my original goal as well, and still is. > > > Anyway I hope the next series will introduce such tests. > > I will see which API functions are stable enough, and add tests > accordingly (in a patch series sooner than later). > > Probably the "biggest" (?) thing that is coming from the larger series > is the introduction of a complete separation between parsing (without > any modification of the input) and formatting. The parser/formatter is > a large chunk of the trailer implementation, so I would expect unit > tests for those bits to have to wait until those improvements are merged > into "next". Ok. > >> Thanks to the aggressive refactoring in this series, I've been able to > >> identify and fix several bugs in our existing implementation. Those fixes > >> build on top of this series but were not included here, in order to keep > >> this series small. Below is a "shortlog" of those fixes I have locally: > >> > >> * "trailer: trailer replacement should not change its position" (If we > >> found a trailer we'd like to replace, preserve its position relative to > >> the other trailers found in the trailer block, instead of always moving > >> it to the beginning or end of the entire trailer block.) > > > > I believe there was a reason why it was done this way. I don't > > remember what it was though. > > Noted. I'll see what I can find in our commit history. > > >> * "interpret-trailers: preserve trailers coming from the input" (Sometimes, > >> the parsed trailers from the input will be formatted differently > >> depending on whether we provide --only-trailers or not. Make the trailers > >> that were not modified and which are coming directly from the input get > >> formatted the same way, regardless of this flag.) > > > > It could be a feature to be able to normalize trailers in a certain way. > > True. But doing such normalization silently is undocumented behavior, > and we should provide explicit flags for this sort of thing. Adding such > flags might be the right thing to do (let's discuss more when this patch > gets proposed). FWIW the behavior I observed is that this normalization > only happens for *some* trailers that have configuration options, not > all trailers in the input. So it's a special kind of (limited) > normalization. Perhaps because we consider that having some configuration means that the user consistently expects things in a certain way. > >> * "interpret-trailers: fail if given unrecognized arguments" (E.g., for > >> "--where", only accept recognized WHERE_* enum values. If we get > >> something unrecognized, fail with an error instead of silently doing > >> nothing. Ditto for "--if-exists" and "--if-missing".) > > > > It's possible that there was a reason why it was done this way. > > > > I think you might want to take a look at the discussions on the > > mailing list when "interpret-trailers" was developed. There were a lot > > of discussions over a long time, and there were a lot of requests and > > suggestions about what it should do. > > I confess I haven't looked too deeply into the original threads > surrounding the introduction of "interpret-trailers". But all of the > changes which I categorize as bugfixes above have a theme of > undocumented modifications. > > While working on this (and the larger) series around trailers, I only > looked into some (not all) of the discussions on the mailing list in > this area. Instead, I deferred to > Documentation/git-interpret-trailers.txt as the official (authoritative) > source of truth. This is partly why I first started out on this project > last year by making improvements to that doc. And, this is why seeing so > many edge cases where we perform such undocumented modifications smelled > more like bugs to me than features. > > That being said, I cannot disagree with your position that the so-called > bugfixes I've reported above could be misguided (and should rather be > just updates to missing documentation). Let's not try to decide that > here, but instead later when I get to introduce those changes on the > list, one at a time. Thanks. Yeah, it might seem like undocumented features are bad, and I agree that reading original discussions can be tiring. But if the latter makes it possible to fix undocumented features by just properly documenting them, then I think it might just be the best thing to do. Ok not to decide about it now though.
Christian Couder <christian.couder@gmail.com> writes: > On Tue, Feb 13, 2024 at 8:39 PM Linus Arver <linusa@google.com> wrote: >> >> Christian Couder <christian.couder@gmail.com> writes: > > [...] > >> >> Thanks to the aggressive refactoring in this series, I've been able to >> >> identify and fix several bugs in our existing implementation. Those fixes >> >> build on top of this series but were not included here, in order to keep >> >> this series small. Below is a "shortlog" of those fixes I have locally: >> >> >> >> * "trailer: trailer replacement should not change its position" (If we >> >> found a trailer we'd like to replace, preserve its position relative to >> >> the other trailers found in the trailer block, instead of always moving >> >> it to the beginning or end of the entire trailer block.) >> > >> > I believe there was a reason why it was done this way. I don't >> > remember what it was though. >> >> Noted. I'll see what I can find in our commit history. >> >> >> * "interpret-trailers: preserve trailers coming from the input" (Sometimes, >> >> the parsed trailers from the input will be formatted differently >> >> depending on whether we provide --only-trailers or not. Make the trailers >> >> that were not modified and which are coming directly from the input get >> >> formatted the same way, regardless of this flag.) >> > >> > It could be a feature to be able to normalize trailers in a certain way. >> >> True. But doing such normalization silently is undocumented behavior, >> and we should provide explicit flags for this sort of thing. Adding such >> flags might be the right thing to do (let's discuss more when this patch >> gets proposed). FWIW the behavior I observed is that this normalization >> only happens for *some* trailers that have configuration options, not >> all trailers in the input. So it's a special kind of (limited) >> normalization. > > Perhaps because we consider that having some configuration means that > the user consistently expects things in a certain way. Yes, this was one possibility I considered after sending my reply. If a user has gone out of their way to configure something, maybe they do want things (for those bits) to be normalized. And adding a flag to disable normalization seems like a good feature to have also (while keeping the behavior of the interpret-trailers that has been relatively untouched since its introduction). But anyway I'm getting a little bit ahead of myself. >> >> * "interpret-trailers: fail if given unrecognized arguments" (E.g., for >> >> "--where", only accept recognized WHERE_* enum values. If we get >> >> something unrecognized, fail with an error instead of silently doing >> >> nothing. Ditto for "--if-exists" and "--if-missing".) >> > >> > It's possible that there was a reason why it was done this way. >> > >> > I think you might want to take a look at the discussions on the >> > mailing list when "interpret-trailers" was developed. There were a lot >> > of discussions over a long time, and there were a lot of requests and >> > suggestions about what it should do. >> >> I confess I haven't looked too deeply into the original threads >> surrounding the introduction of "interpret-trailers". But all of the >> changes which I categorize as bugfixes above have a theme of >> undocumented modifications. >> >> While working on this (and the larger) series around trailers, I only >> looked into some (not all) of the discussions on the mailing list in >> this area. Instead, I deferred to >> Documentation/git-interpret-trailers.txt as the official (authoritative) >> source of truth. This is partly why I first started out on this project >> last year by making improvements to that doc. And, this is why seeing so >> many edge cases where we perform such undocumented modifications smelled >> more like bugs to me than features. >> >> That being said, I cannot disagree with your position that the so-called >> bugfixes I've reported above could be misguided (and should rather be >> just updates to missing documentation). Let's not try to decide that >> here, but instead later when I get to introduce those changes on the >> list, one at a time. Thanks. > > Yeah, it might seem like undocumented features are bad, and I agree > that reading original discussions can be tiring. But if the latter > makes it possible to fix undocumented features by just properly > documenting them, then I think it might just be the best thing to do. > Ok not to decide about it now though. Thanks!
Christian Couder <christian.couder@gmail.com> writes: > (Sorry for sending this previously to Junio only) > > On Tue, Feb 13, 2024 at 6:30 PM Junio C Hamano <gitster@pobox.com> wrote: >> >> Christian Couder <christian.couder@gmail.com> writes: > >> >> I presume that [01-09/28] would be the first part, nothing >> >> controversial and consisting of obvious clean-ups? I do not mind >> >> merging that part down to remove the future review load if everybody >> >> agrees. >> > >> > Yeah, patches [01-09/28] look good to me. >> >> I was hoping that you'll give us more details of what the other 3 or >> more you would envision the series to be, actually. > > I think the next one could be [10-16/28], so until "trailer: finish > formatting unification". > > Then I am not sure about the next one, perhaps [17-20/28] or [17-21/28]. > > The rest would depend on the splitting of the big patches towards the > end of the series. Ack, I'll try to group them like this. Thanks.