Message ID | xmqq8scymmo1.fsf@gitster.c.googlers.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 9af3360572069b937a146cbf552a12d7e24b7c91 |
Headers | show |
Series | [RFH] sequencer: simplify logic around stopped-sha file | expand |
Hi Junio, On Thu, 24 Sep 2020, Junio C Hamano wrote: > While I was auditing the use of get_oid_committish(), I noticed that > an abbreviated object name is written to rebase-merge/stopped-sha > only to be read later and expanded to a full object name to be > passed to record_in_rewritten(). Nobody seems to expose this to the > end-user and it is unclear if there is a point in keeping it short > by abbreviating and risking to make it ambiguous as the rebase > progresses and creates new objects. > > The attached patch tries to simplify the logic involved around this > file, to write and read full object names to/from the file, and the > result seems to pass testsuite---which in the ideal world would be > sufficient signal that this change is safe and sane, but it could be > that original authors thought that a change to stop abbreviating is > nonsense and not worth protecting the code against, hence RFH here. > > Is there something obvious I am not seeing that makes this change a > bad idea (other than "somebody may be in the middle of a rebase and > all of a sudden, version of Git gets updated to contain this one, > which is unable to read abbreviated object name the current version > left on disk", which I am deliberately ignoring)? At least in my understanding, you are not missing anything: - this file is an implementation detail, - it is not exposed directly via any user-visible interface, - any reader will _have_ to be prepared for an unabbreviated object ID (in the highly unlikely case that an object ID would be ambiguous if abbreviated even by one hex character), - and most importantly: just like we expand the commit IDs in the todo list, we actually want to expand them in `stopped-sha` because it _is_ possible that a new object is written that makes the previous unambiguously abbreviated object ID now ambiguous (e.g. when the user commits in a separate worktree while the rebase is interrupted, before continuing the rebase). In short: ACK Thank you, Dscho > > Thanks. > > > sequencer.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/sequencer.c b/sequencer.c > index fd7701c88a..7dc9088d09 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -120,7 +120,7 @@ static GIT_PATH_FUNC(rebase_path_author_script, "rebase-merge/author-script") > static GIT_PATH_FUNC(rebase_path_amend, "rebase-merge/amend") > /* > * When we stop at a given patch via the "edit" command, this file contains > - * the abbreviated commit name of the corresponding patch. > + * the commit object name of the corresponding patch. > */ > static GIT_PATH_FUNC(rebase_path_stopped_sha, "rebase-merge/stopped-sha") > /* > @@ -3012,11 +3012,12 @@ static int make_patch(struct repository *r, > { > struct strbuf buf = STRBUF_INIT; > struct rev_info log_tree_opt; > - const char *subject, *p; > + const char *subject; > + char hex[GIT_MAX_HEXSZ + 1]; > int res = 0; > > - p = short_commit_name(commit); > - if (write_message(p, strlen(p), rebase_path_stopped_sha(), 1) < 0) > + oid_to_hex_r(hex, &commit->object.oid); > + if (write_message(hex, strlen(hex), rebase_path_stopped_sha(), 1) < 0) > return -1; > res |= write_rebase_head(&commit->object.oid); > > @@ -4396,7 +4397,7 @@ int sequencer_continue(struct repository *r, struct replay_opts *opts) > > if (read_oneliner(&buf, rebase_path_stopped_sha(), > READ_ONELINER_SKIP_IF_EMPTY) && > - !get_oid_committish(buf.buf, &oid)) > + !get_oid_hex(buf.buf, &oid)) > record_in_rewritten(&oid, peek_command(&todo_list, 0)); > strbuf_release(&buf); > } >
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> Is there something obvious I am not seeing that makes this change a >> bad idea (other than "somebody may be in the middle of a rebase and >> all of a sudden, version of Git gets updated to contain this one, >> which is unable to read abbreviated object name the current version >> left on disk", which I am deliberately ignoring)? > > At least in my understanding, you are not missing anything: > > - this file is an implementation detail, > > - it is not exposed directly via any user-visible interface, > > - any reader will _have_ to be prepared for an unabbreviated object ID (in > the highly unlikely case that an object ID would be ambiguous if > abbreviated even by one hex character), Ah, no that was not what I was worried about. If an existing code, after restarting and reading the abbreviated object name back from the file, compares that shortened string against list of shortened strings it has in-core (perhaps it reads todo back from the user in abbreviated form, does comparison with stopped-sha as-is, before it expands the object names from todo to full length), this change would break it. > - and most importantly: just like we expand the commit IDs in the todo > list, we actually want to expand them in `stopped-sha` because it _is_ > possible that a new object is written that makes the previous > unambiguously abbreviated object ID now ambiguous (e.g. when the user > commits in a separate worktree while the rebase is interrupted, before > continuing the rebase). Exactly. I just wasn't sure if stopped-sha is handled with the same carefulness as the object names in todo, which are expanded after read and shortened before given back to the users. Thanks.
Hi Junio, On Sat, 26 Sep 2020, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> Is there something obvious I am not seeing that makes this change a > >> bad idea (other than "somebody may be in the middle of a rebase and > >> all of a sudden, version of Git gets updated to contain this one, > >> which is unable to read abbreviated object name the current version > >> left on disk", which I am deliberately ignoring)? > > > > [...] > > - and most importantly: just like we expand the commit IDs in the todo > > list, we actually want to expand them in `stopped-sha` because it _is_ > > possible that a new object is written that makes the previous > > unambiguously abbreviated object ID now ambiguous (e.g. when the user > > commits in a separate worktree while the rebase is interrupted, before > > continuing the rebase). > > Exactly. I just wasn't sure if stopped-sha is handled with the same > carefulness as the object names in todo, which are expanded after > read and shortened before given back to the users. The main purpose of `stopped-sha` is to let `git rebase --continue` after an `edit` command amend the commit where it stopped _iff_ it is still `HEAD`. So yes, I think we need to be as careful here. Ciao, Dscho
diff --git a/sequencer.c b/sequencer.c index fd7701c88a..7dc9088d09 100644 --- a/sequencer.c +++ b/sequencer.c @@ -120,7 +120,7 @@ static GIT_PATH_FUNC(rebase_path_author_script, "rebase-merge/author-script") static GIT_PATH_FUNC(rebase_path_amend, "rebase-merge/amend") /* * When we stop at a given patch via the "edit" command, this file contains - * the abbreviated commit name of the corresponding patch. + * the commit object name of the corresponding patch. */ static GIT_PATH_FUNC(rebase_path_stopped_sha, "rebase-merge/stopped-sha") /* @@ -3012,11 +3012,12 @@ static int make_patch(struct repository *r, { struct strbuf buf = STRBUF_INIT; struct rev_info log_tree_opt; - const char *subject, *p; + const char *subject; + char hex[GIT_MAX_HEXSZ + 1]; int res = 0; - p = short_commit_name(commit); - if (write_message(p, strlen(p), rebase_path_stopped_sha(), 1) < 0) + oid_to_hex_r(hex, &commit->object.oid); + if (write_message(hex, strlen(hex), rebase_path_stopped_sha(), 1) < 0) return -1; res |= write_rebase_head(&commit->object.oid); @@ -4396,7 +4397,7 @@ int sequencer_continue(struct repository *r, struct replay_opts *opts) if (read_oneliner(&buf, rebase_path_stopped_sha(), READ_ONELINER_SKIP_IF_EMPTY) && - !get_oid_committish(buf.buf, &oid)) + !get_oid_hex(buf.buf, &oid)) record_in_rewritten(&oid, peek_command(&todo_list, 0)); strbuf_release(&buf); }