From patchwork Mon Feb 17 15:27:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: shejialuo X-Patchwork-Id: 13978051 Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DCCF022257C for ; Mon, 17 Feb 2025 15:27:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739806075; cv=none; b=iw+KsNuoguY+WQGI6MEFzlQW/yO4KBB+XA5I7UO+wAOZC2/Vh3waHEnrHJg0JeoDJGu/0IO9WeR3pCR1PzqaW+uOAhrXOnulHF6bWhfy2yU6ET8KNintWhveNJrJKTX8W0aWJM1YSTeqLrslTtpXzoQhyKPiefQtXIqt0OIsjtc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739806075; c=relaxed/simple; bh=C8qRdmayenu7okl3xZoc4Vj0HeENEsEZ/GiXMSrlQEU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=tZX4NSO4pNeXrmBQ6wJ5epwXs72SPJN3e9AjJEWEkqM0vQsadd7ZZNN7jMouW9itUcWnNN7oWeApOj5QKEr0g4wHXTM/LQE95WrEAc+er/qoYt4J6yr7gAVMBxk7Nqzh8Ir/bT+Wb2RA6zFiIo3RaqnqBUpwDwFAjX3YjXYUcY4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=J7diFVux; arc=none smtp.client-ip=209.85.216.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="J7diFVux" Received: by mail-pj1-f52.google.com with SMTP id 98e67ed59e1d1-2f833af7a09so6465184a91.2 for ; Mon, 17 Feb 2025 07:27:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1739806073; x=1740410873; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=M4Zy7qzXJPdxt56h6/1I1oTtrN3fD/oFXriouk8XL8M=; b=J7diFVuxob1b1niCJsdySpc5MXoc/X9TabpNerHM7+2/kQ6SVsCR9enT4dEQ/gTL0D 1ICsEZNXCsjZ3mYxdSfW9HEJdXKI6uPhX/WrK+zO0v2XeZ8JkR3TBozVnbhOrQED4mni 0mFBJi3yeL3Zs1oX/gzr5VJpDK9Xw1FVsPV2Oht6k5Nr1AoBw/U8OSBUIlj6V/wPZC1W vb1gyvGo1gZSL7D+OVEaB/IuwjOPBlirVmdUsEMuxn+5IaB1v5SB7mBya7dxWlGSJzy1 XeOX2YMwySLXjWME1bPnGV3JdrF7EdJgvSkcXTM6tjhUDhpma3LgCjTZ+DxB+FMxByGU ichQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1739806073; x=1740410873; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=M4Zy7qzXJPdxt56h6/1I1oTtrN3fD/oFXriouk8XL8M=; b=kotYbq1O9GY/mfbYpPanq+kzVSl7xqzuzEfxwJR7ycQzNm3qOVCZXrcyd/hg1aexgS HEaMyRsFHeLQEUEpgnF5f5REvGhWpLmimtUlX0sy20jWaiAm+lDh26U2SkLxHVNpqkqN 41eSJrESePLXfezmM2cIO9sNYZ4j1x+eDSieTOA4O52bGcokCoZQmwWeMAlxjgAoJ6F+ MXpKSoZaooGiv3WWuMxnv3fsV/ccM4F546iRIv4tOhLghhcupXDSjERWtqlzH2+Pu6E0 /QRHmAT1PvROecgJYE40G5ahz/xuxIqm4L4kvRo3Dw0sMOguxM8U5+sFLVhmoNPA/HXh 4g/Q== X-Gm-Message-State: AOJu0YyubrkErsnQCgvhSVO8OYfGk9dsTB3ecdGE1gia5llS88zytzdg Zlzo8M4rHUekwih8Pi6FUYWJ+JZWF0BfBB3LwAmzaBAG5Ixo5r3wDlhtjw== X-Gm-Gg: ASbGncu6P+a+SKz+3Rcz2m4uU7pPq53CRiXxlGjxCroY1YS/NKrXd0+9bp9OYWybnMG StU4Ot3lDq8HzTMm8KaPLkfHLTWLZbc9NWMOTRocSrZfHa9j4P6T3m1XDR4amSYBCDdIXCMCv8a j3J9x6+GWVTzrZ3MAZNi24u9I6PDYaYuqnhLaAXY6MfmCIhKHGQYrozYTS7Gh9V2e+j81wLRCC4 nh7nfT9PKFnyeAF+s3+EgvQ+/uoq5FP7OWA079FNLmXe0uqYpYQCSR87dq3g1GvsJPBsSZkqlNI ENsQ1a07UjU= X-Google-Smtp-Source: AGHT+IEl/yRtblWb0R+QRGhDI8X1bEb7LOopK/Mgsr+V7GDYncwIG+zBC868kaTKrREVk/EZ5a4tmw== X-Received: by 2002:a17:90b:1b0d:b0:2ee:c9b6:4c42 with SMTP id 98e67ed59e1d1-2fc40f22cbamr17713770a91.16.1739806072506; Mon, 17 Feb 2025 07:27:52 -0800 (PST) Received: from localhost ([2605:52c0:1:4cf:6c5a:92ff:fe25:ceff]) by smtp.gmail.com with UTF8SMTPSA id 98e67ed59e1d1-2fc13ad2f20sm8218485a91.23.2025.02.17.07.27.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Feb 2025 07:27:52 -0800 (PST) Date: Mon, 17 Feb 2025 23:27:50 +0800 From: shejialuo To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , Junio C Hamano , Michael Haggerty Subject: [PATCH v5 4/8] packed-backend: add "packed-refs" header consistency check Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In "packed-backend.c::create_snapshot", if there is a header (the line which starts with '#'), we will check whether the line starts with "# pack-refs with:". Before we port this check into "packed_fsck", let's fix "create_snapshot" to check the prefix "# packed-ref with: " instead of "# packed-ref with:" due to that we will always write a single trailing space after the colon. However, we need to consider other situations and discuss whether we need to add checks. 1. If the header does not exist, we should not report an error to the user. This is because in older Git version, we never write header in the "packed-refs" file. Also, we do allow no header in "packed-refs" in runtime. 2. If the header content does not start with "# packed-ref with: ", we should report an error just like what "create_snapshot" does. So, create a new fsck message "badPackedRefHeader(ERROR)" for this. 3. If the header content is not the same as the constant string "PACKED_REFS_HEADER". This is expected because we make it extensible intentionally and runtime "create_snapshot" won't complain about unknown traits. In order to align with the runtime behavior. There is no need to report. As we have analyzed, we only need to check the case 2 in the above. In order to do this, use "open_nofollow" function to get the file descriptor and then read the "packed-refs" file via "strbuf_read". Like what "create_snapshot" and other functions do, we could split the line by finding the next newline in the buffer. When we cannot find a newline, we could report an error. So, create a function "packed_fsck_ref_next_line" to find the next newline and if there is no such newline, use "packedRefEntryNotTerminated(ERROR)" to report an error to the user. Then, parse the first line to apply the checks. Update the test to exercise the code. Mentored-by: Patrick Steinhardt Mentored-by: Karthik Nayak Signed-off-by: shejialuo --- Documentation/fsck-msgids.adoc | 8 +++ fsck.h | 2 + refs/packed-backend.c | 96 +++++++++++++++++++++++++++++++++- t/t0602-reffiles-fsck.sh | 52 ++++++++++++++++++ 4 files changed, 157 insertions(+), 1 deletion(-) diff --git a/Documentation/fsck-msgids.adoc b/Documentation/fsck-msgids.adoc index b14bc44ca4..11906f90fd 100644 --- a/Documentation/fsck-msgids.adoc +++ b/Documentation/fsck-msgids.adoc @@ -16,6 +16,10 @@ `badObjectSha1`:: (ERROR) An object has a bad sha1. +`badPackedRefHeader`:: + (ERROR) The "packed-refs" file contains an invalid + header. + `badParentSha1`:: (ERROR) A commit object has a bad parent sha1. @@ -176,6 +180,10 @@ `nullSha1`:: (WARN) Tree contains entries pointing to a null sha1. +`packedRefEntryNotTerminated`:: + (ERROR) The "packed-refs" file contains an entry that is + not terminated by a newline. + `refMissingNewline`:: (INFO) A loose ref that does not end with newline(LF). As valid implementations of Git never created such a loose ref diff --git a/fsck.h b/fsck.h index a44c231a5f..67e3c97bc0 100644 --- a/fsck.h +++ b/fsck.h @@ -30,6 +30,7 @@ enum fsck_msg_type { FUNC(BAD_EMAIL, ERROR) \ FUNC(BAD_NAME, ERROR) \ FUNC(BAD_OBJECT_SHA1, ERROR) \ + FUNC(BAD_PACKED_REF_HEADER, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ FUNC(BAD_REF_CONTENT, ERROR) \ FUNC(BAD_REF_FILETYPE, ERROR) \ @@ -53,6 +54,7 @@ enum fsck_msg_type { FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ + FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ FUNC(ZERO_PADDED_DATE, ERROR) \ diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 8140a31d07..09eb3886c3 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs) tmp = xmemdupz(snapshot->buf, eol - snapshot->buf); - if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p)) + if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p)) die_invalid_line(refs->path, snapshot->buf, snapshot->eof - snapshot->buf); @@ -1749,13 +1749,78 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s return empty_ref_iterator_begin(); } +static int packed_fsck_ref_next_line(struct fsck_options *o, + unsigned long line_number, const char *start, + const char *eof, const char **eol) +{ + int ret = 0; + + *eol = memchr(start, '\n', eof - start); + if (!*eol) { + struct strbuf packed_entry = STRBUF_INIT; + struct fsck_ref_report report = { 0 }; + + strbuf_addf(&packed_entry, "packed-refs line %lu", line_number); + report.path = packed_entry.buf; + ret = fsck_report_ref(o, &report, + FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED, + "'%.*s' is not terminated with a newline", + (int)(eof - start), start); + + /* + * There is no newline but we still want to parse it to the end of + * the buffer. + */ + *eol = eof; + strbuf_release(&packed_entry); + } + + return ret; +} + +static int packed_fsck_ref_header(struct fsck_options *o, + const char *start, const char *eol) +{ + if (!starts_with(start, "# pack-refs with: ")) { + struct fsck_ref_report report = { 0 }; + report.path = "packed-refs.header"; + + return fsck_report_ref(o, &report, + FSCK_MSG_BAD_PACKED_REF_HEADER, + "'%.*s' does not start with '# pack-refs with: '", + (int)(eol - start), start); + } + + return 0; +} + +static int packed_fsck_ref_content(struct fsck_options *o, + const char *start, const char *eof) +{ + unsigned long line_number = 1; + const char *eol; + int ret = 0; + + ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol); + if (*start == '#') { + ret |= packed_fsck_ref_header(o, start, eol); + + start = eol + 1; + line_number++; + } + + return ret; +} + static int packed_fsck(struct ref_store *ref_store, struct fsck_options *o, struct worktree *wt) { struct packed_ref_store *refs = packed_downcast(ref_store, REF_STORE_READ, "fsck"); + struct strbuf packed_ref_content = STRBUF_INIT; struct stat st; + int fd; int ret = 0; if (!is_main_worktree(wt)) @@ -1784,7 +1849,36 @@ static int packed_fsck(struct ref_store *ref_store, goto cleanup; } + /* + * There is a chance that "packed-refs" file is removed or converted to + * a symlink after filetype check and before open. So we need to avoid + * this race condition by opening the file. + */ + fd = open_nofollow(refs->path, O_RDONLY); + if (fd < 0) { + if (errno == ENOENT) + goto cleanup; + + if (errno == ELOOP) { + struct fsck_ref_report report = { 0 }; + report.path = "packed-refs"; + ret = fsck_report_ref(o, &report, + FSCK_MSG_BAD_REF_FILETYPE, + "not a regular file"); + goto cleanup; + } + } + + if (strbuf_read(&packed_ref_content, fd, 0) < 0) { + ret = error_errno(_("unable to read %s"), refs->path); + goto cleanup; + } + + ret = packed_fsck_ref_content(o, packed_ref_content.buf, + packed_ref_content.buf + packed_ref_content.len); + cleanup: + strbuf_release(&packed_ref_content); return ret; } diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh index e65ca341cd..e055c36e74 100755 --- a/t/t0602-reffiles-fsck.sh +++ b/t/t0602-reffiles-fsck.sh @@ -639,4 +639,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' ) ' +test_expect_success 'packed-refs header should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit default && + + git refs verify 2>err && + test_must_be_empty err && + + for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \ + "# pack-refs with traits: peeled fully-peeled sorted " \ + "# pack-refs with a: peeled fully-peeled" \ + "# pack-refs with:peeled fully-peeled sorted" + do + printf "%s\n" "$bad_header" >.git/packed-refs && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\'' + EOF + rm .git/packed-refs && + test_cmp expect err || return 1 + done + ) +' + +test_expect_success 'packed-refs missing header should not be reported' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit default && + + printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs && + git refs verify 2>err && + test_must_be_empty err + ) +' + +test_expect_success 'packed-refs unknown traits should not be reported' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit default && + + printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs && + git refs verify 2>err && + test_must_be_empty err + ) +' + test_done