Message ID | 20240226100803.GC2685600@coredump.intra.peff.net (mailing list archive) |
---|---|
State | Accepted |
Commit | 5edd12672086c6b6d92147925dda5dd3bca2b658 |
Headers | show |
Series | show-branch --reflog fixes | expand |
On Mon, Feb 26, 2024 at 05:08:03AM -0500, Jeff King wrote: > Thus nobody should actually look at the reflog entry info we return. But > we'll still put in some fake values just to be on the safe side, since > this is such a subtle and confusing interface. Likewise, we'll document > what's going on in a comment above the function declaration. If this > were a function with a lot of callers, the footgun would probably not be > worth it. But it has only ever had two callers in its 18-year existence, > and it seems unlikely to grow more. So let's hold our noses and let > users enjoy the convenience of a simulated ref@{0}. Obviously I'm sympathetic to Patrick's position that this empty-reflog special case is kind of gross. ;) That's one of the reasons I split this out from patch 2; we can see exactly what must be done to make each case work. And in fact I had originally started to write a patch that simply changed t1508 to expect failure. I could still be persuaded to go that way if anybody feels strongly. -Peff
Jeff King <peff@peff.net> writes: > if (!cb.reccnt) { > + if (cnt == 0) { Style "if (!cnt)" ? In this particular case I do not think it actually is an improvement, though, simply because zero is really special in this logic. > + /* > + * The caller asked for ref@{0}, and we had no entries. > + * It's a bit subtle, but in practice all callers have > + * prepped the "oid" field with the current value of > + * the ref, which is the most reasonable fallback. > + * > + * We'll put dummy values into the out-parameters (so > + * they're not just uninitialized garbage), and the > + * caller can take our return value as a hint that > + * we did not find any such reflog. > + */ > + set_read_ref_cutoffs(&cb, 0, 0, "empty reflog"); > + return 1; > + } The dummy value I 100% agree with ;-). You mentioned the convenience special case for time-based reflog query for a time older than (e.g. @{20.years.ago}) the reflog itself, and perhaps this one should be treated as its counterpart, that is only useful for count-based access.
Jeff King <peff@peff.net> writes: > That's one of the reasons I split this out from patch 2; we can see > exactly what must be done to make each case work. And in fact I had > originally started to write a patch that simply changed t1508 to expect > failure. I could still be persuaded to go that way if anybody feels > strongly. I do not feel strongly either way myself. It just is interesting that the older end of the history is with @{20.years.ago} special case that is only for time-based query, while the newer end of the history is with @{0} special case.
On Mon, Feb 26, 2024 at 09:25:28AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > if (!cb.reccnt) { > > + if (cnt == 0) { > > Style "if (!cnt)" ? In this particular case I do not think it > actually is an improvement, though, simply because zero is really > special in this logic. Yeah, I didn't really choose "== 0" as a conscious decision. But I do tend to write "!cnt" when available, so I think I sub-consciously chose this as a better description of the intent. > > > + /* > > + * The caller asked for ref@{0}, and we had no entries. > > + * It's a bit subtle, but in practice all callers have > > + * prepped the "oid" field with the current value of > > + * the ref, which is the most reasonable fallback. > > + * > > + * We'll put dummy values into the out-parameters (so > > + * they're not just uninitialized garbage), and the > > + * caller can take our return value as a hint that > > + * we did not find any such reflog. > > + */ > > + set_read_ref_cutoffs(&cb, 0, 0, "empty reflog"); > > + return 1; > > + } > > The dummy value I 100% agree with ;-). > > You mentioned the convenience special case for time-based reflog > query for a time older than (e.g. @{20.years.ago}) the reflog > itself, and perhaps this one should be treated as its counterpart, > that is only useful for count-based access. I think the true counterpart to that would be extending what's added by patch 2 to get_oid_basic() to stop checking that we hit the count exactly. Let me try to lay out my thinking. If you _do_ have a reflog and the request (whether time or count-based) goes too far back, read_ref_at() will give you the oldest entry and return "1". And then in get_oid_basic(): - if it was a time based cutoff (like @{20.years.ago}), we will issue a warning ("log only goes back to 2024-01-01") but return the value anyway. - before this series, if it is a count based cutoff (like @{9999}), we fail and say "there are only have N entries". - after this series, we special case asking for @{9999} when there are 9998 entries by returning the "old" oid but not issuing any warning (we do not need to, because we found the right answer for what it would have been had that old entry not been pruned). - what we _could_ do (but this series does not), and what would be the true counterpart to the @{20.years.ago} case, is to allow @{9999} for a reflog with only 20 entries, returning the old value from 20 (or the new value if it was a creation!?) and issuing a warning saying "well, it only went back 20, but here you go". I'm not so sure about that last one. It is pretty subjective, but somehow asking for timestamps feels more "fuzzy" to me, and Git returning a fuzzy answer is OK. Whereas asking for item 9999 in a list with 20 items and getting back an answer feels more absolutely wrong. I could be persuaded if there were a concrete use case, but I can't really think of one. It seems more likely to confuse and hinder a user than to help them. -Peff
On Mon, Feb 26, 2024 at 09:25:32AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > That's one of the reasons I split this out from patch 2; we can see > > exactly what must be done to make each case work. And in fact I had > > originally started to write a patch that simply changed t1508 to expect > > failure. I could still be persuaded to go that way if anybody feels > > strongly. > > I do not feel strongly either way myself. It just is interesting > that the older end of the history is with @{20.years.ago} special > case that is only for time-based query, while the newer end of the > history is with @{0} special case. I laid out my thinking on the 20.years.ago special case in another reply, but I wanted to say one more thing. The special case here in patch 3 is making @{0} work for the empty reflog, but there is no matching special case for time-based timestamps. If you have an empty reflog and ask for @{20.years.ago}, you will get the usual "nope, the reflog is empty" response (as opposed to having a non-empty reflog that cuts off before 20 years ago). Obviously we could make that work, but I think the point is that "@{0}" is special magic for "the current value" in a way that a timestamp isn't really. -Peff
Jeff King <peff@peff.net> writes: > Let me try to lay out my thinking. If you _do_ have a reflog and the > request (whether time or count-based) goes too far back, read_ref_at() > will give you the oldest entry and return "1". And then in > get_oid_basic(): > ... > - what we _could_ do (but this series does not), and what would be the > true counterpart to the @{20.years.ago} case, is to allow @{9999} > for a reflog with only 20 entries, returning the old value from 20 > (or the new value if it was a creation!?) and issuing a warning > saying "well, it only went back 20, but here you go". Ah, I wasn't drawing *that* similarity. My thinking was more like - When you have two entries in reflog, ref@{0} will use and find the latest entry whose value is the same as the ref itself. - When you have one entry, @{0} will use and find the latest entry whose value is the same as the ref itself. - When you have zero entry, @{0} can do the same by taking advantage of the fact that its value is supposed to be the same as the ref itself anyway. that happens near the youngest end of a reflog, contrasting with the @{20.years.ago} that happens near the oldest end. > I'm not so sure about that last one. It is pretty subjective, but > somehow asking for timestamps feels more "fuzzy" to me, and Git > returning a fuzzy answer is OK. Whereas asking for item 9999 in a list > with 20 items and getting back an answer feels more absolutely wrong. I > could be persuaded if there were a concrete use case, but I can't really > think of one. It seems more likely to confuse and hinder a user than to > help them. I do not think anybody misses @{9999} not giving the oldest available, simply because "oldest" is a concept that fits better with time-based queries than count-based queries.
diff --git a/refs.c b/refs.c index 6b826b002e..8eaec5eca7 100644 --- a/refs.c +++ b/refs.c @@ -1109,6 +1109,21 @@ int read_ref_at(struct ref_store *refs, const char *refname, refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent, &cb); if (!cb.reccnt) { + if (cnt == 0) { + /* + * The caller asked for ref@{0}, and we had no entries. + * It's a bit subtle, but in practice all callers have + * prepped the "oid" field with the current value of + * the ref, which is the most reasonable fallback. + * + * We'll put dummy values into the out-parameters (so + * they're not just uninitialized garbage), and the + * caller can take our return value as a hint that + * we did not find any such reflog. + */ + set_read_ref_cutoffs(&cb, 0, 0, "empty reflog"); + return 1; + } if (flags & GET_OID_QUIETLY) exit(128); else diff --git a/refs.h b/refs.h index 303c5fac4d..37116ad2b2 100644 --- a/refs.h +++ b/refs.h @@ -440,7 +440,20 @@ int refs_create_reflog(struct ref_store *refs, const char *refname, struct strbuf *err); int safe_create_reflog(const char *refname, struct strbuf *err); -/** Reads log for the value of ref during at_time. **/ +/** + * Reads log for the value of ref during at_time (in which case "cnt" should be + * negative) or the reflog "cnt" entries from the top (in which case "at_time" + * should be 0). + * + * If we found the reflog entry in question, returns 0 (and details of the + * entry can be found in the out-parameters). + * + * If we ran out of reflog entries, the out-parameters are filled with the + * details of the oldest entry we did find, and the function returns 1. Note + * that there is one important special case here! If the reflog was empty + * and the caller asked for the 0-th cnt, we will return "1" but leave the + * "oid" field untouched. + **/ int read_ref_at(struct ref_store *refs, const char *refname, unsigned int flags, timestamp_t at_time, int cnt, diff --git a/t/t1508-at-combinations.sh b/t/t1508-at-combinations.sh index 370bf7137e..e841309d0e 100755 --- a/t/t1508-at-combinations.sh +++ b/t/t1508-at-combinations.sh @@ -110,7 +110,7 @@ test_expect_success '@{1} works with only one reflog entry' ' test_cmp_rev newbranch~ newbranch@{1} ' -test_expect_failure '@{0} works with empty reflog' ' +test_expect_success '@{0} works with empty reflog' ' git checkout -B newbranch main && git reflog expire --expire=now refs/heads/newbranch && test_cmp_rev newbranch newbranch@{0} diff --git a/t/t3202-show-branch.sh b/t/t3202-show-branch.sh index 35f35f8091..a1139f79e2 100755 --- a/t/t3202-show-branch.sh +++ b/t/t3202-show-branch.sh @@ -279,8 +279,8 @@ test_expect_success '--reflog shows reflog entries' ' test_expect_success '--reflog handles missing reflog' ' git reflog expire --expire=now branch && - test_must_fail git show-branch --reflog branch 2>err && - grep "log .* is empty" err + git show-branch --reflog branch >actual && + test_must_be_empty actual ' test_done
The previous commit special-cased get_oid_basic()'s handling of ref@{n} for a reflog with n entries. But its special case doesn't work for ref@{0} in an empty reflog, because read_ref_at() dies when it notices the empty reflog! We can make this work by special-casing this in read_ref_at(). It's somewhat gross, for two reasons: 1. We have no reflog entry to describe in the "msg" out-parameter. So we have to leave it uninitialized or make something up. 2. Likewise, we have no oid to put in the "oid" out-parameter. Leaving it untouched is actually the best thing here, as all of the callers will have initialized it with the current ref value via repo_dwim_log(). This is rather subtle, but it is how things worked in 6436a20284 (refs: allow @{n} to work with n-sized reflog, 2021-01-07) before we reverted it. The key difference from 6436a20284 here is that we'll return "1" to indicate that we _didn't_ find the requested reflog entry. Coupled with the special-casing in get_oid_basic() in the previous commit, that's enough to make looking up ref@{0} work, and we can flip 6436a20284's test back to expect_success. It also means that the call in show-branch which segfaulted with 6436a20284 (and which is now tested in t3202) remains OK. The caller notices that we could not find any reflog entry, and so it breaks out of its loop, showing nothing. This is different from the current behavior of producing an error, but it's just as reasonable (and is exactly what we'd do if you asked it to walk starting at ref@{1} but there was only 1 entry). Thus nobody should actually look at the reflog entry info we return. But we'll still put in some fake values just to be on the safe side, since this is such a subtle and confusing interface. Likewise, we'll document what's going on in a comment above the function declaration. If this were a function with a lot of callers, the footgun would probably not be worth it. But it has only ever had two callers in its 18-year existence, and it seems unlikely to grow more. So let's hold our noses and let users enjoy the convenience of a simulated ref@{0}. Signed-off-by: Jeff King <peff@peff.net> --- refs.c | 15 +++++++++++++++ refs.h | 15 ++++++++++++++- t/t1508-at-combinations.sh | 2 +- t/t3202-show-branch.sh | 4 ++-- 4 files changed, 32 insertions(+), 4 deletions(-)