From patchwork Mon Dec 12 22:48:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13071502
Date: Mon, 12 Dec 2022 14:48:48 -0800
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Subject: [PATCH v5 1/4] object-file: remove OBJECT_INFO_IGNORE_LOOSE
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, gitster@pobox.com
X-Mailing-List: git@vger.kernel.org

Its last user was removed in 97b2fa08b6 (fetch-pack: drop custom loose
object cache, 2018-11-12), so we can remove it.

Helped-by: Jeff King
Signed-off-by: Jonathan Tan
---
 object-file.c  | 3 ---
 object-store.h | 4 +---
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 26290554bb..cf724bc19b 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1575,9 +1575,6 @@ static int do_oid_object_info_extended(struct repository *r,
 		if (find_pack_entry(r, real, &e))
 			break;
 
-		if (flags & OBJECT_INFO_IGNORE_LOOSE)
-			return -1;
-
 		/* Most likely it's a loose object. */
 		if (!loose_object_info(r, real, oi, flags))
 			return 0;
diff --git a/object-store.h b/object-store.h
index 1be57abaf1..b1ec0bde82 100644
--- a/object-store.h
+++ b/object-store.h
@@ -434,13 +434,11 @@ struct object_info {
 #define OBJECT_INFO_ALLOW_UNKNOWN_TYPE 2
 /* Do not retry packed storage after checking packed and loose storage */
 #define OBJECT_INFO_QUICK 8
-/* Do not check loose object */
-#define OBJECT_INFO_IGNORE_LOOSE 16
 /*
  * Do not attempt to fetch the object if missing (even if fetch_is_missing is
  * nonzero).
  */
-#define OBJECT_INFO_SKIP_FETCH_OBJECT 32
+#define OBJECT_INFO_SKIP_FETCH_OBJECT 16
 /*
  * This is meant for bulk prefetching of missing blobs in a partial
  * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK

From patchwork Mon Dec 12 22:48:49 2022
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13071503
Date: Mon, 12 Dec 2022 14:48:49 -0800
Message-ID: <4b2fb687432c2ce1471d9eb02e86b3acc43cc953.1670885252.git.jonathantanmy@google.com>
Subject: [PATCH v5 2/4] object-file: refactor map_loose_object_1()
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, gitster@pobox.com

This function can do 3 things:
 1. Gets an fd given a path
 2. Simultaneously gets a path and fd given an OID
 3. Memory maps an fd

Keep 3 (renaming the function accordingly) and inline 1 and 2 into their
respective callers.

Signed-off-by: Jonathan Tan
---
 object-file.c | 50 ++++++++++++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 26 deletions(-)

diff --git a/object-file.c b/object-file.c
index cf724bc19b..429e3a746d 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1211,35 +1211,25 @@ static int quick_has_loose(struct repository *r,
 }
 
 /*
- * Map the loose object at "path" if it is not NULL, or the path found by
- * searching for a loose object named "oid".
+ * Map and close the given loose object fd. The path argument is used for
+ * error reporting.
  */
-static void *map_loose_object_1(struct repository *r, const char *path,
-			     const struct object_id *oid, unsigned long *size)
+static void *map_fd(int fd, const char *path, unsigned long *size)
 {
-	void *map;
-	int fd;
-
-	if (path)
-		fd = git_open(path);
-	else
-		fd = open_loose_object(r, oid, &path);
-	map = NULL;
-	if (fd >= 0) {
-		struct stat st;
+	void *map = NULL;
+	struct stat st;
 
-		if (!fstat(fd, &st)) {
-			*size = xsize_t(st.st_size);
-			if (!*size) {
-				/* mmap() is forbidden on empty files */
-				error(_("object file %s is empty"), path);
-				close(fd);
-				return NULL;
-			}
-			map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (!fstat(fd, &st)) {
+		*size = xsize_t(st.st_size);
+		if (!*size) {
+			/* mmap() is forbidden on empty files */
+			error(_("object file %s is empty"), path);
+			close(fd);
+			return NULL;
 		}
-		close(fd);
+		map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
 	}
+	close(fd);
 	return map;
 }
 
@@ -1247,7 +1237,12 @@ void *map_loose_object(struct repository *r,
 			const struct object_id *oid,
 			unsigned long *size)
 {
-	return map_loose_object_1(r, NULL, oid, size);
+	const char *p;
+	int fd = open_loose_object(r, oid, &p);
+
+	if (fd < 0)
+		return NULL;
+	return map_fd(fd, p, size);
 }
 
 enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
@@ -2789,13 +2784,16 @@ int read_loose_object(const char *path,
 		      struct object_info *oi)
 {
 	int ret = -1;
+	int fd;
 	void *map = NULL;
 	unsigned long mapsize;
 	git_zstream stream;
 	char hdr[MAX_HEADER_LEN];
 	unsigned long *size = oi->sizep;
 
-	map = map_loose_object_1(the_repository, path, NULL, &mapsize);
+	fd = git_open(path);
+	if (fd >= 0)
+		map = map_fd(fd, path, &mapsize);
 	if (!map) {
 		error_errno(_("unable to mmap %s"), path);
 		goto out;

From patchwork Mon Dec 12 22:48:50 2022
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13071505
Date: Mon, 12 Dec 2022 14:48:50 -0800
Subject: [PATCH v5 3/4] object-file: emit corruption errors when detected
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, gitster@pobox.com

Instead of relying on errno being preserved across function calls, teach
do_oid_object_info_extended() to itself report object corruption when
it first detects it. There are 3 types of corruption being detected:
 - when a replacement object is missing
 - when a loose object is corrupt
 - when a packed object is corrupt and the object cannot be read
   in another way

Note that in the RHS of this patch's diff, a check for ENOENT that was
introduced in 3ba7a06552 (A loose object is not corrupt if it cannot be
read due to EMFILE, 2010-10-28) is also removed. The purpose of this
check is to avoid a false report of corruption if the errno contains
something like EMFILE (or anything that is not ENOENT), in which case a
more generic report is presented.
Because, as of this patch, we no longer rely on such a heuristic to
determine corruption, but surface the error message at the point when we
read something that we did not expect, this check is no longer necessary.

Besides being more resilient, this also prepares for a future patch in
which an indirect caller of do_oid_object_info_extended() will need
such functionality.

Helped-by: Jeff King
Signed-off-by: Jonathan Tan
---
 object-file.c  | 55 +++++++++++++++++++++++++-------------------------
 object-store.h |  3 +++
 2 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/object-file.c b/object-file.c
index 429e3a746d..e0cef8b906 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1422,7 +1422,9 @@ static int loose_object_info(struct repository *r,
 			     struct object_info *oi, int flags)
 {
 	int status = 0;
+	int fd;
 	unsigned long mapsize;
+	const char *path = NULL;
 	void *map;
 	git_zstream stream;
 	char hdr[MAX_HEADER_LEN];
@@ -1443,7 +1445,6 @@ static int loose_object_info(struct repository *r,
 	 * object even exists.
 	 */
 	if (!oi->typep && !oi->type_name && !oi->sizep && !oi->contentp) {
-		const char *path;
 		struct stat st;
 		if (!oi->disk_sizep && (flags & OBJECT_INFO_QUICK))
 			return quick_has_loose(r, oid) ? 0 : -1;
@@ -1454,7 +1455,13 @@ static int loose_object_info(struct repository *r,
 		return 0;
 	}
 
-	map = map_loose_object(r, oid, &mapsize);
+	fd = open_loose_object(r, oid, &path);
+	if (fd < 0) {
+		if (errno != ENOENT)
+			error_errno(_("unable to open loose object %s"), oid_to_hex(oid));
+		return -1;
+	}
+	map = map_fd(fd, path, &mapsize);
 	if (!map)
 		return -1;
 
@@ -1492,6 +1499,10 @@ static int loose_object_info(struct repository *r,
 		break;
 	}
 
+	if (status && path && (flags & OBJECT_INFO_DIE_IF_CORRUPT))
+		die(_("loose object %s (stored in %s) is corrupt"),
+		    oid_to_hex(oid), path);
+
 	git_inflate_end(&stream);
 cleanup:
 	munmap(map, mapsize);
@@ -1601,6 +1612,15 @@ static int do_oid_object_info_extended(struct repository *r,
 			continue;
 		}
 
+		if (flags & OBJECT_INFO_DIE_IF_CORRUPT) {
+			const struct packed_git *p;
+			if ((flags & OBJECT_INFO_LOOKUP_REPLACE) && !oideq(real, oid))
+				die(_("replacement %s not found for %s"),
+				    oid_to_hex(real), oid_to_hex(oid));
+			if ((p = has_packed_and_bad(r, real)))
+				die(_("packed object %s (stored in %s) is corrupt"),
+				    oid_to_hex(real), p->pack_name);
+		}
 		return -1;
 	}
 
@@ -1653,7 +1673,8 @@ int oid_object_info(struct repository *r,
 static void *read_object(struct repository *r,
 			 const struct object_id *oid,
 			 enum object_type *type,
-			 unsigned long *size)
+			 unsigned long *size,
+			 int die_if_corrupt)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 	void *content;
@@ -1661,7 +1682,8 @@ static void *read_object(struct repository *r,
 	oi.sizep = size;
 	oi.contentp = &content;
 
-	if (oid_object_info_extended(r, oid, &oi, 0) < 0)
+	if (oid_object_info_extended(r, oid, &oi, die_if_corrupt
+				     ? OBJECT_INFO_DIE_IF_CORRUPT : 0) < 0)
 		return NULL;
 	return content;
 }
@@ -1697,35 +1719,14 @@ void *read_object_file_extended(struct repository *r,
 				int lookup_replace)
 {
 	void *data;
-	const struct packed_git *p;
-	const char *path;
-	struct stat st;
 	const struct object_id *repl = lookup_replace ?
 		lookup_replace_object(r, oid) : oid;
 
 	errno = 0;
-	data = read_object(r, repl, type, size);
+	data = read_object(r, repl, type, size, 1);
 	if (data)
 		return data;
 
-	obj_read_lock();
-	if (errno && errno != ENOENT)
-		die_errno(_("failed to read object %s"), oid_to_hex(oid));
-
-	/* die if we replaced an object with one that does not exist */
-	if (repl != oid)
-		die(_("replacement %s not found for %s"),
-		    oid_to_hex(repl), oid_to_hex(oid));
-
-	if (!stat_loose_object(r, repl, &st, &path))
-		die(_("loose object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), path);
-
-	if ((p = has_packed_and_bad(r, repl)))
-		die(_("packed object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), p->pack_name);
-	obj_read_unlock();
-
 	return NULL;
 }
 
@@ -2268,7 +2269,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 
 	if (has_loose_object(oid))
 		return 0;
-	buf = read_object(the_repository, oid, &type, &len);
+	buf = read_object(the_repository, oid, &type, &len, 0);
 	if (!buf)
 		return error(_("cannot read object for %s"), oid_to_hex(oid));
 	hdrlen = format_object_header(hdr, sizeof(hdr), type, len);
diff --git a/object-store.h b/object-store.h
index b1ec0bde82..98c1d67946 100644
--- a/object-store.h
+++ b/object-store.h
@@ -445,6 +445,9 @@ struct object_info {
  */
 #define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
 
+/* Die if object corruption (not just an object being missing) was detected. */
+#define OBJECT_INFO_DIE_IF_CORRUPT 32
+
 int oid_object_info_extended(struct repository *r,
 			     const struct object_id *,
 			     struct object_info *, unsigned flags);

From patchwork Mon Dec 12 22:48:51 2022
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13071504
Date: Mon, 12 Dec 2022 14:48:51 -0800
Subject: [PATCH v5 4/4] commit: don't lazy-fetch commits
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, gitster@pobox.com

When parsing commits, fail fast when the commit is missing or
corrupt, instead of attempting to fetch them. This is done by inlining
repo_read_object_file() and setting the flag that prevents fetching.

This is motivated by a situation in which through a bug (not necessarily
through Git), there was corruption in the object store of a partial
clone. In this particular case, the problem was exposed when "git gc"
tried to expire reflogs, which calls repo_parse_commit(), which triggers
fetches of the missing commits.

(There are other possible solutions to this problem including passing an
argument from "git gc" to "git reflog" to inhibit all lazy fetches, but
I think that this fix is at the wrong level - fixing "git reflog" means
that this particular command works fine, or so we think (it will fail if
it somehow needs to read a legitimately missing blob, say, a .gitmodules
file), but fixing repo_parse_commit() will fix a whole class of bugs.)

Signed-off-by: Jonathan Tan
Signed-off-by: Junio C Hamano
---
 commit.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 572301b80a..a02723f06b 100644
--- a/commit.c
+++ b/commit.c
@@ -508,6 +508,17 @@ int repo_parse_commit_internal(struct repository *r,
 	enum object_type type;
 	void *buffer;
 	unsigned long size;
+	struct object_info oi = {
+		.typep = &type,
+		.sizep = &size,
+		.contentp = &buffer,
+	};
+	/*
+	 * Git does not support partial clones that exclude commits, so set
+	 * OBJECT_INFO_SKIP_FETCH_OBJECT to fail fast when an object is missing.
+	 */
+	int flags = OBJECT_INFO_LOOKUP_REPLACE | OBJECT_INFO_SKIP_FETCH_OBJECT |
+		OBJECT_INFO_DIE_IF_CORRUPT;
 	int ret;
 
 	if (!item)
@@ -516,8 +527,8 @@ int repo_parse_commit_internal(struct repository *r,
 		return 0;
 	if (use_commit_graph && parse_commit_in_graph(r, item))
 		return 0;
-	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
-	if (!buffer)
+
+	if (oid_object_info_extended(r, &item->object.oid, &oi, flags) < 0)
 		return quiet_on_missing ? -1 :
 			error("Could not read %s",
 			     oid_to_hex(&item->object.oid));
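
Reviewer's aside on patch 2/4: the "map and close" contract of map_fd() can be tried outside of Git with plain POSIX calls. The sketch below is only an approximation under stated assumptions. The name map_fd_sketch() is hypothetical, mmap() and fprintf() stand in for Git's xmmap() and error() wrappers, and a size_t replaces the unsigned long size that the patch fills in through xsize_t(). What it is meant to show is that the descriptor is closed on every path, so the caller only has to check the returned pointer.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for map_fd() from patch 2/4: map the whole file
 * behind fd read-only and close fd on every path. Returns NULL on
 * failure. The path argument is used only for error reporting.
 */
static void *map_fd_sketch(int fd, const char *path, size_t *size)
{
	void *map = NULL;
	struct stat st;

	if (!fstat(fd, &st)) {
		*size = (size_t)st.st_size;
		if (!*size) {
			/* mmap() rejects zero-length mappings */
			fprintf(stderr, "error: object file %s is empty\n", path);
			close(fd);
			return NULL;
		}
		map = mmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
		/* Assumption: report failure as NULL; Git's xmmap() dies instead. */
		if (map == MAP_FAILED)
			map = NULL;
	}
	close(fd);
	return map;
}
```

A caller would open the file itself, skip the call when the descriptor is negative (as map_loose_object() and read_loose_object() do after this patch), and munmap() the result when done.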