Message ID | pull.830.git.1609506428.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Remove more index compatibility macros | expand |
On Fri, Jan 1, 2021 at 5:10 AM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > > I noticed that Duy's project around USE_THE_INDEX_COMPATIBILITY_MACROS has > been on pause for a while. Here is my attempt to continue that project a > little. > > I started going through the builtins that still use cache_name_pos() and the > first few were easy: merge-inex, mv, rm. > > Then I hit update-index and it was a bit bigger. It's included here as well. > > My strategy for update-index was to create static globals "repo" and > "istate" that point to the_repository and the_index, respectively. Then, I > was able to remove macros one-by-one without changing method prototypes > within the file. > > I had started trying to keep everything local to the method signatures, but > I hit a snag when reaching the command-line parsing callbacks, which I could > not modify their call signature. At that point, I had something that was > already much more complicated than what I present now. Outside of the first > update-index commit, everything was a mechanical find/replace. > > In total, this allows us to remove four of the compatibility macros because > they are no longer used. This series is divided nicely in a way that makes review easy. I've made some of these same types of changes elsewhere, and the whole series is really just a long sequence of mechanical changes (plus a case or two of fixing formatting on a line you were changing anyway, such as adding spaces around an operator). I think the work to continue dropping the implicit dependency on the_index is helpful. I only found two minor suggestions for improving the commit messages; the patches all look good to me.
On Fri, Jan 1, 2021 at 8:09 AM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > My strategy for update-index was to create static globals "repo" and > "istate" that point to the_repository and the_index, respectively. Then, I > was able to remove macros one-by-one without changing method prototypes > within the file. > > I had started trying to keep everything local to the method signatures, but > I hit a snag when reaching the command-line parsing callbacks, which I could > not modify their call signature. [...] You should be able to do this, not by modifying the callback signature, but by taking advantage of the `extra` member of `struct option` which is available to callback functions or arbitrary use. If you need to access the index in a callback, then assign a `struct index_state *` to `extra`; likewise assign a `struct repository *` to `extra` to access the repository. If you need access to both the index and the repository, then just store the repository in `extra` since the repository has an `index` field. You won't be able to use any of the canned OPT_FOO() macros to initialize an entry in the update-index.c `options[]` array which needs `extra`-initialization since the macros don't let you specify `extra`, but you can easily bypass the macro and initialize the `struct option` manually. (After all, the macros exist for convenience; they are not a hard requirement.) Within the callback, extract the `repository` or `index_state` as you would any other field. For instance: const struct repository *repo = opt->extra; This should allow you to get rid of the globals introduced by patch [4/12] (assuming passing the index and repo arguments around everywhere doesn't get overly hairy).
On 1/2/2021 1:12 AM, Eric Sunshine wrote: > On Fri, Jan 1, 2021 at 8:09 AM Derrick Stolee via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> My strategy for update-index was to create static globals "repo" and >> "istate" that point to the_repository and the_index, respectively. Then, I >> was able to remove macros one-by-one without changing method prototypes >> within the file. >> >> I had started trying to keep everything local to the method signatures, but >> I hit a snag when reaching the command-line parsing callbacks, which I could >> not modify their call signature. [...] > > You should be able to do this, not by modifying the callback > signature, but by taking advantage of the `extra` member of `struct > option` which is available to callback functions or arbitrary use. If > you need to access the index in a callback, then assign a `struct > index_state *` to `extra`; likewise assign a `struct repository *` to > `extra` to access the repository. If you need access to both the index > and the repository, then just store the repository in `extra` since > the repository has an `index` field. > > You won't be able to use any of the canned OPT_FOO() macros to > initialize an entry in the update-index.c `options[]` array which > needs `extra`-initialization since the macros don't let you specify > `extra`, but you can easily bypass the macro and initialize the > `struct option` manually. (After all, the macros exist for > convenience; they are not a hard requirement.) > > Within the callback, extract the `repository` or `index_state` as you > would any other field. For instance: > > const struct repository *repo = opt->extra; Yes, this is definitely the way to make it possible. > This should allow you to get rid of the globals introduced by patch > [4/12] (assuming passing the index and repo arguments around > everywhere doesn't get overly hairy). My attempts just getting to the point of hitting these callbacks was already making me frustrated with how complicated the code became with that approach. Perhaps now that I've removed the compatibility macros, it would be easier to insert the method parameters since most of the lines that need to change would be method prototypes and the calls to those methods (plus the callback function details). Is that a valuable effort? I could give it a try, but I want to be sure that adjusting all of those helper methods in the builtin would indeed have valuable improvements over the static globals used here. Thanks, -Stolee
On Sun, Jan 3, 2021 at 8:01 PM Derrick Stolee <stolee@gmail.com> wrote: > On 1/2/2021 1:12 AM, Eric Sunshine wrote: > > This should allow you to get rid of the globals introduced by patch > > [4/12] (assuming passing the index and repo arguments around > > everywhere doesn't get overly hairy). > > Perhaps now that I've removed the compatibility macros, it would be > easier to insert the method parameters since most of the lines that > need to change would be method prototypes and the calls to those methods > (plus the callback function details). > > Is that a valuable effort? I could give it a try, but I want to be sure > that adjusting all of those helper methods in the builtin would indeed > have valuable improvements over the static globals used here. My impression was that the goal of the earlier work was to pass the index and repository to each function specifically to avoid tying the function to a particular index or repository. This helps in cases in which client code needs to operate on a different index or repository (for instance, a submodule). Generally speaking, making the index and repository file-static rather than global does not help reach that goal since functions are still tied to state which is not local to the function itself. Would the extra effort be valuable in this particular case? I'm not familiar with this code, but given that `update-index` is a builtin, such effort may not be too meaningful. If, however, any of the code from `buildin/update-index.c` ever gets "libified" and moved into the core library, then that would be a good time to update the functions to take those values as arguments rather than relying on file-static or globals. But that's not something that this series necessarily needs to do; the task can wait until the code needs to be shared by other modules, I would think.
On 1/4/2021 1:22 AM, Eric Sunshine wrote: > On Sun, Jan 3, 2021 at 8:01 PM Derrick Stolee <stolee@gmail.com> wrote: >> On 1/2/2021 1:12 AM, Eric Sunshine wrote: >>> This should allow you to get rid of the globals introduced by patch >>> [4/12] (assuming passing the index and repo arguments around >>> everywhere doesn't get overly hairy). >> >> Perhaps now that I've removed the compatibility macros, it would be >> easier to insert the method parameters since most of the lines that >> need to change would be method prototypes and the calls to those methods >> (plus the callback function details). >> >> Is that a valuable effort? I could give it a try, but I want to be sure >> that adjusting all of those helper methods in the builtin would indeed >> have valuable improvements over the static globals used here. > > My impression was that the goal of the earlier work was to pass the > index and repository to each function specifically to avoid tying the > function to a particular index or repository. This helps in cases in > which client code needs to operate on a different index or repository > (for instance, a submodule). Generally speaking, making the index and > repository file-static rather than global does not help reach that > goal since functions are still tied to state which is not local to the > function itself. > > Would the extra effort be valuable in this particular case? I'm not > familiar with this code, but given that `update-index` is a builtin, > such effort may not be too meaningful. If, however, any of the code > from `buildin/update-index.c` ever gets "libified" and moved into the > core library, then that would be a good time to update the functions > to take those values as arguments rather than relying on file-static > or globals. But that's not something that this series necessarily > needs to do; the task can wait until the code needs to be shared by > other modules, I would think. I tried again tonight, and it started getting messy, but then I realized that I could group the callbacks that need the repo and index to use a common struct that holds the other parameters they need. It's still a bigger patch than I'd like, but it is more reasonable. v2 is incoming with my attempt at this. Thanks, -Stolee
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > My strategy for update-index was to create static globals "repo" and > "istate" that point to the_repository and the_index, respectively. Then, I > was able to remove macros one-by-one without changing method prototypes > within the file. Knee-jerk reaction: swapping one pair of global with another? Would that give us enough upside? It may allow some codepaths involved to work on an in-core index instance that is different from the_index, but does not make them reentrant. Do we now have callers that actually pass an in-core index instance that is different from the_index, and more importantly, that fail loudly if the codepaths involved in this conversion forgets to update some accesses to the_index not to the specified one? If not, ... > In total, this allows us to remove four of the compatibility macros because > they are no longer used. ... a conversion like this, removing the use of the compatibility macros for the sake of removing them, invites future headaches by leaving untested code churn behind with potential bugs that will only get discovered after somebody actually starts making calls with the non-default in-core index instances. I've come to know the competence of you well enough to trust your patches like patches from other proficient, prolific and prominent contributors (I won't name names, but you know who you are), but we are all human and are prone to introduce bugs. That's all my knee-jerk impression before actually reading the series through, though. I'll certainloy know more after reading them. Thanks. > Derrick Stolee (12): > merge-index: drop index compatibility macros > mv: remove index compatibility macros > rm: remove compatilibity macros > update-index: drop the_index, the_repository > update-index: use istate->cache over active_cache > update-index: use index->cache_nr over active_nr > update-index: use istate->cache_changed > update-index: use index_name_pos() over cache_name_pos() > update-index: use remove_file_from_index() > update-index: use add_index_entry() > update-index: replace several compatibility macros > update-index: remove ce_match_stat(), all macros > > Documentation/technical/racy-git.txt | 6 +- > builtin/merge-index.c | 33 +++--- > builtin/mv.c | 42 ++++---- > builtin/rm.c | 56 ++++++----- > builtin/update-index.c | 145 ++++++++++++++------------- > cache.h | 4 - > 6 files changed, 149 insertions(+), 137 deletions(-) > > > base-commit: 71ca53e8125e36efbda17293c50027d31681a41f > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-830%2Fderrickstolee%2Findex-compatibility-1-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-830/derrickstolee/index-compatibility-1-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/830
On 1/5/2021 10:55 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> My strategy for update-index was to create static globals "repo" and >> "istate" that point to the_repository and the_index, respectively. Then, I >> was able to remove macros one-by-one without changing method prototypes >> within the file. > > Knee-jerk reaction: swapping one pair of global with another? Would > that give us enough upside? It may allow some codepaths involved to > work on an in-core index instance that is different from the_index, > but does not make them reentrant. My intention was to reduce the use of globals in libgit.a while keeping with existing patterns of static globals in the builtin code. While this can be thought of "module variables" instead of true globals, they aren't exactly desirable. In v2, these static globals are temporary to the series and are completely removed by the end. The new patch sequence can hopefully be seen as "this preprocessor macro was expanded" and then "static globals are replaced with method parameters" which are pretty straightforward. > Do we now have callers that actually pass an in-core index instance > that is different from the_index, and more importantly, that fail > loudly if the codepaths involved in this conversion forgets to > update some accesses to the_index not to the specified one? > > If not, ... > >> In total, this allows us to remove four of the compatibility macros because >> they are no longer used. > > ... a conversion like this, removing the use of the compatibility > macros for the sake of removing them, invites future headaches by > leaving untested code churn behind with potential bugs that will > only get discovered after somebody actually starts making calls > with the non-default in-core index instances. Perhaps I had misunderstood the state of the conversion project. I thought that the full conversion was just paused because Duy moved on to other things. I thought it might be valuable to pick up the baton while also thinking about the space. If this is _not_ a valuable project to continue, then I can hold off for now. Unfortunately, we'll never know if everything is safe from assuming the_index until the macro itself is gone. It helps that libgit.a doesn't use it at > I've come to know the competence of you well enough to trust your > patches like patches from other proficient, prolific and prominent > contributors (I won't name names, but you know who you are), but we > are all human and are prone to introduce bugs. That means a lot, thanks. And yes, I'm well aware that bugs can be introduced. I've added my share. > That's all my knee-jerk impression before actually reading the > series through, though. I'll certainloy know more after reading > them. I look forward to your thoughts. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: > My intention was to reduce the use of globals in libgit.a while keeping > with existing patterns of static globals in the builtin code. The above (without having thought about it too hard or long enough yet) sounds a sensible guideline to go by. Thanks for a helpful hint to keep in mind while reading the series. > While > this can be thought of "module variables" instead of true globals, they > aren't exactly desirable. In v2, these static globals are temporary to > the series and are completely removed by the end. ;-)