From patchwork Wed Jun 12 08:53:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694661
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 1/7] fsck: add refs check interfaces to interface with fsck error levels
Date: Wed, 12 Jun 2024 16:53:43 +0800
Message-ID: <20240612085349.710785-2-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com>
 <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

git-fsck(1) focuses on checking the consistency of the object database.
It relies on "fsck_options" to interact with the fsck error levels.
However, "fsck_options" is designed for checking the object database, so
changing its semantics to also cover refs is not appropriate. Instead,
create a "fsck_refs_options" structure to handle the refs consistency
check.

"git_fsck_config" sets up the "msg_type" and "skiplist" members of
"fsck_options". For refs, we only need "msg_type". To make it easy to
add more refs-specific options later, add a separate function
"git_fsck_refs_config" to initialize the refs-specific options.

Move the "msg_type" and "strict" members to the top of "fsck_options" so
that a "fsck_refs_options *" can be converted to a "fsck_options *",
which lets us reuse the interfaces provided by "fsck.h" without changing
the original code.

The static function "report" in "fsck.c" reports problems related to the
object database and cannot be reused for refs. Provide a
"fsck_refs_report" function to integrate the fsck error levels into the
reference consistency check.
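For illustration only (this sketch is not part of the patch): the level
handling described above can be pictured as folding the configured
message type down to "warning" or "error" before a callback sees it.
All "demo_*" names and the "demoMsgId" string used with them are
hypothetical stand-ins, not git's real identifiers, and the output
format is an assumption modelled on warning()/error().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for git's "enum fsck_msg_type". */
enum demo_msg_type {
	DEMO_INFO,
	DEMO_WARN,
	DEMO_ERROR,
	DEMO_FATAL,
	DEMO_IGNORE
};

/* Fold the configured level into what a refs callback has to handle. */
static enum demo_msg_type demo_fold_level(enum demo_msg_type type)
{
	if (type == DEMO_FATAL)
		return DEMO_ERROR;
	if (type == DEMO_INFO)
		return DEMO_WARN;
	return type;
}

/*
 * Write "<level>: <name>: <id>: <formatted message>" into buf.
 * Returns 1 for an error, 0 for a warning or an ignored message.
 */
static int demo_refs_report(char *buf, size_t len,
			    enum demo_msg_type configured,
			    const char *name, const char *id,
			    const char *fmt, ...)
{
	enum demo_msg_type type = demo_fold_level(configured);
	va_list ap;
	int n;

	assert(buf && len > 0 && name && id && fmt);
	buf[0] = '\0';
	if (type == DEMO_IGNORE)
		return 0;	/* ignored messages produce no output */

	n = snprintf(buf, len, "%s: %s: %s: ",
		     type == DEMO_WARN ? "warning" : "error", name, id);
	if (n > 0 && (size_t)n < len) {
		va_start(ap, fmt);
		vsnprintf(buf + n, len - (size_t)n, fmt, ap);
		va_end(ap);
	}
	return type == DEMO_ERROR;
}
```

A caller would pass the level configured for a message id, e.g.
demo_refs_report(buf, sizeof(buf), DEMO_FATAL, "refs/heads/main",
"demoMsgId", "bad byte %d", 7), and treat a non-zero return as a
failed check.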

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 fsck.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fsck.h | 39 +++++++++++++++++++++++++++++++--
 2 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 8ef962199f..13528c646e 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1249,6 +1249,20 @@ int fsck_buffer(const struct object_id *oid, enum object_type type,
 		      type);
 }

+int fsck_refs_error_function(struct fsck_refs_options *o UNUSED,
+			     const char *name,
+			     enum fsck_msg_type msg_type,
+			     enum fsck_msg_id msg_id UNUSED,
+			     const char *message)
+{
+	if (msg_type == FSCK_WARN) {
+		warning("%s: %s", name, message);
+		return 0;
+	}
+	error("%s: %s", name, message);
+	return 1;
+}
+
 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid,
 			enum object_type object_type UNUSED,
@@ -1323,6 +1337,61 @@ int fsck_finish(struct fsck_options *options)
 	return ret;
 }

+int fsck_refs_report(struct fsck_refs_options *o,
+		     const char *name,
+		     enum fsck_msg_id msg_id,
+		     const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	enum fsck_msg_type msg_type =
+		fsck_msg_type(msg_id, (struct fsck_options*)o);
+	int ret = 0;
+
+	if (msg_type == FSCK_IGNORE)
+		return 0;
+
+	if (msg_type == FSCK_FATAL)
+		msg_type = FSCK_ERROR;
+	else if (msg_type == FSCK_INFO)
+		msg_type = FSCK_WARN;
+
+	prepare_msg_ids();
+	strbuf_addf(&sb, "%s: ", msg_id_info[msg_id].camelcased);
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	ret = o->error_func(o, name, msg_type, msg_id, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return ret;
+}
+
+int git_fsck_refs_config(const char *var, const char *value,
+			 const struct config_context *ctx, void *cb)
+{
+	struct fsck_refs_options *options = cb;
+	const char *msg_id;
+
+	/*
+	 * We don't check the value of fsck.skiplist here, because it
+	 * is specific to object database, not reference database.
+	 */
+	if (strcmp(var, "fsck.skiplist") == 0) {
+		return 0;
+	}
+
+	if (skip_prefix(var, "fsck.", &msg_id)) {
+		if (!value)
+			return config_error_nonbool(var);
+		fsck_set_msg_type((struct fsck_options*)options, msg_id, value);
+		return 0;
+	}
+
+	return git_default_config(var, value, ctx, cb);
+}
+
 int git_fsck_config(const char *var, const char *value,
 		    const struct config_context *ctx, void *cb)
 {
diff --git a/fsck.h b/fsck.h
index 17fa2dda5d..7451b1f91b 100644
--- a/fsck.h
+++ b/fsck.h
@@ -96,6 +96,7 @@ enum fsck_msg_id {
 };
 #undef MSG_ID

+struct fsck_refs_options;
 struct fsck_options;
 struct object;

@@ -107,6 +108,21 @@ void fsck_set_msg_type(struct fsck_options *options,
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
 int is_valid_msg_type(const char *msg_id, const char *msg_type);

+/*
+ * callback function for fsck refs and reflogs.
+ */
+typedef int (*fsck_refs_error)(struct fsck_refs_options *o,
+			       const char *name,
+			       enum fsck_msg_type msg_type,
+			       enum fsck_msg_id msg_id,
+			       const char *message);
+
+int fsck_refs_error_function(struct fsck_refs_options *o,
+			     const char *name,
+			     enum fsck_msg_type msg_type,
+			     enum fsck_msg_id msg_id,
+			     const char *message);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -135,11 +151,22 @@ int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o,
 					   enum fsck_msg_id msg_id,
 					   const char *message);

+struct fsck_refs_options {
+	enum fsck_msg_type *msg_type;
+	unsigned strict:1;
+	fsck_refs_error error_func;
+	unsigned verbose:1;
+};
+
+#define FSCK_REFS_OPTIONS_DEFAULT { \
+	.error_func = fsck_refs_error_function, \
+}
+
 struct fsck_options {
+	enum fsck_msg_type *msg_type;
+	unsigned strict:1;
 	fsck_walk_func walk;
 	fsck_error error_func;
-	unsigned strict:1;
-	enum fsck_msg_type *msg_type;
 	struct oidset skiplist;
 	struct oidset gitmodules_found;
 	struct oidset gitmodules_done;
@@ -221,6 +248,12 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
  */
 int fsck_finish(struct fsck_options *options);

+__attribute__((format (printf, 4, 5)))
+int fsck_refs_report(struct fsck_refs_options *o,
+		     const char *name,
+		     enum fsck_msg_id msg_id,
+		     const char *fmt, ...);
+
 /*
  * Subsystem for storing human-readable names for each object.
  *
@@ -247,6 +280,8 @@ const char *fsck_describe_object(struct fsck_options *options,
 				 const struct object_id *oid);

 struct key_value_info;
+int git_fsck_refs_config(const char *var, const char *value,
+			 const struct config_context *ctx, void *cb);
 /*
  * git_config() callback for use by fsck-y tools that want to support
  * fsck.<msg> fsck.skipList etc.

From patchwork Wed Jun 12 08:53:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694662
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 2/7] refs: set up ref consistency check infrastructure
Date: Wed, 12 Jun 2024 16:53:44 +0800
Message-ID: <20240612085349.710785-3-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com>
 <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

The interfaces defined in "ref_storage_be" are grouped by purpose into
five parts:

1. The name and the initialization interfaces.
2. The ref transaction interfaces.
3. The ref internal interfaces (pack, rename and copy).
4. The ref filesystem interfaces.
5. The reflog related interfaces.

To stay consistent with git-fsck(1), add a new "fsck" interface (of type
"fsck_fn") to the end of "ref_storage_be". It does not fit into any of
the five categories above, so separate it from the others with a blank
line.
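For illustration only (this sketch is not part of the patch): the
dispatch described above amounts to one more function pointer in the
backend table, with a generic entry point that forwards to it. The
"demo_*" names and the "problems" counter are hypothetical; in the patch
itself the packed and reftable callbacks simply return 0 for now.

```c
#include <assert.h>
#include <stddef.h>

struct demo_ref_store;

/* Hypothetical stand-in for "struct fsck_refs_options". */
struct demo_fsck_opts {
	unsigned strict:1;
	unsigned verbose:1;
};

typedef int demo_fsck_fn(struct demo_ref_store *store,
			 struct demo_fsck_opts *o);

struct demo_backend {
	const char *name;
	/* init, transaction, iterator and reflog callbacks elided */

	demo_fsck_fn *fsck;
};

struct demo_ref_store {
	const struct demo_backend *be;
	struct demo_ref_store *packed;	/* set only for the "files" store */
	int problems;			/* pretend number of broken refs */
};

/* Generic entry point: dispatch to whichever backend owns the store. */
static int demo_refs_fsck(struct demo_ref_store *store,
			  struct demo_fsck_opts *o)
{
	assert(store && store->be && store->be->fsck);
	return store->be->fsck(store, o);
}

static int demo_packed_fsck(struct demo_ref_store *store,
			    struct demo_fsck_opts *o)
{
	(void)o;
	return store->problems ? 1 : 0;
}

/* Like files_fsck() in the patch, only delegate to the packed store. */
static int demo_files_fsck(struct demo_ref_store *store,
			   struct demo_fsck_opts *o)
{
	return demo_refs_fsck(store->packed, o);
}

static const struct demo_backend demo_be_packed = {
	.name = "packed",
	.fsck = demo_packed_fsck,
};

static const struct demo_backend demo_be_files = {
	.name = "files",
	.fsck = demo_files_fsck,
};
```

Keeping the callback in the backend table means a caller only needs the
generic entry point and never has to know which backend a store uses.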

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 refs.c                  |  5 +++++
 refs.h                  |  8 ++++++++
 refs/debug.c            |  9 +++++++++
 refs/files-backend.c    | 15 ++++++++++++++-
 refs/packed-backend.c   |  8 ++++++++
 refs/refs-internal.h    |  6 ++++++
 refs/reftable-backend.c |  8 ++++++++
 7 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index f7c7765d23..0922439275 100644
--- a/refs.c
+++ b/refs.c
@@ -316,6 +316,11 @@ int check_refname_format(const char *refname, int flags)
 	return check_or_sanitize_refname(refname, flags, NULL);
 }

+int refs_fsck(struct ref_store *refs, struct fsck_refs_options *o)
+{
+	return refs->be->fsck(refs, o);
+}
+
 void sanitize_refname_component(const char *refname, struct strbuf *out)
 {
 	if (check_or_sanitize_refname(refname, REFNAME_ALLOW_ONELEVEL, out))
diff --git a/refs.h b/refs.h
index 76d25df4de..5a042695f5 100644
--- a/refs.h
+++ b/refs.h
@@ -3,6 +3,7 @@

 #include "commit.h"

+struct fsck_refs_options;
 struct object_id;
 struct ref_store;
 struct repository;
@@ -547,6 +548,13 @@ int refs_for_each_reflog(struct ref_store *refs, each_reflog_fn fn, void *cb_dat
  */
 int check_refname_format(const char *refname, int flags);

+/*
+ * Check the reference database for consistency. Return 0 if refs and
+ * reflogs are consistent, and non-zero otherwise. The errors will be
+ * written to stderr.
+ */
+int refs_fsck(struct ref_store *refs, struct fsck_refs_options *o);
+
 /*
  * Apply the rules from check_refname_format, but mutate the result until it
  * is acceptable, and place the result in "out".
diff --git a/refs/debug.c b/refs/debug.c
index 547d9245b9..110a264522 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -419,6 +419,13 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
 	return res;
 }

+static int debug_fsck(struct ref_store *ref_store,
+		      struct fsck_refs_options *o)
+{
+	trace_printf_key(&trace_refs, "fsck\n");
+	return 0;
+}
+
 struct ref_storage_be refs_be_debug = {
 	.name = "debug",
 	.init = NULL,
@@ -451,4 +458,6 @@ struct ref_storage_be refs_be_debug = {
 	.create_reflog = debug_create_reflog,
 	.delete_reflog = debug_delete_reflog,
 	.reflog_expire = debug_reflog_expire,
+
+	.fsck = debug_fsck,
 };
diff --git a/refs/files-backend.c b/refs/files-backend.c
index cb752d32b6..e965345ad8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3402,6 +3402,17 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
 	return ret;
 }

+static int files_fsck(struct ref_store *ref_store,
+		      struct fsck_refs_options *o)
+{
+	int ret;
+	struct files_ref_store *refs =
+		files_downcast(ref_store, REF_STORE_READ, "fsck");
+
+	ret = refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+	return ret;
+}
+
 struct ref_storage_be refs_be_files = {
 	.name = "files",
 	.init = files_ref_store_init,
@@ -3428,5 +3439,7 @@ struct ref_storage_be refs_be_files = {
 	.reflog_exists = files_reflog_exists,
 	.create_reflog = files_create_reflog,
 	.delete_reflog = files_delete_reflog,
-	.reflog_expire = files_reflog_expire
+	.reflog_expire = files_reflog_expire,
+
+	.fsck = files_fsck,
 };
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index c4c1e36aa2..db152053f8 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1733,6 +1733,12 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
 	return empty_ref_iterator_begin();
 }

+static int packed_fsck(struct ref_store *ref_store,
+		       struct fsck_refs_options *o)
+{
+	return 0;
+}
+
 struct ref_storage_be refs_be_packed = {
 	.name = "packed",
 	.init = packed_ref_store_init,
@@ -1760,4 +1766,6 @@ struct ref_storage_be refs_be_packed = {
 	.create_reflog = NULL,
 	.delete_reflog = NULL,
 	.reflog_expire = NULL,
+
+	.fsck = packed_fsck,
 };
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index cbcb6f9c36..8f42f21e77 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -4,6 +4,7 @@
 #include "refs.h"
 #include "iterator.h"

+struct fsck_refs_options;
 struct ref_transaction;

 /*
@@ -650,6 +651,9 @@ typedef int read_raw_ref_fn(struct ref_store *ref_store, const char *refname,
 typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refname,
 				 struct strbuf *referent);

+typedef int fsck_fn(struct ref_store *ref_store,
+		    struct fsck_refs_options *o);
+
 struct ref_storage_be {
 	const char *name;
 	ref_store_init_fn *init;
@@ -677,6 +681,8 @@ struct ref_storage_be {
 	create_reflog_fn *create_reflog;
 	delete_reflog_fn *delete_reflog;
 	reflog_expire_fn *reflog_expire;
+
+	fsck_fn *fsck;
 };

 extern struct ref_storage_be refs_be_files;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index e555be4671..7f606faa9e 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2242,6 +2242,12 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
 	return ret;
 }

+static int reftable_be_fsck(struct ref_store *ref_store,
+			    struct fsck_refs_options *o)
+{
+	return 0;
+}
+
 struct ref_storage_be refs_be_reftable = {
 	.name = "reftable",
 	.init = reftable_be_init,
@@ -2269,4 +2275,6 @@ struct ref_storage_be refs_be_reftable = {
 	.create_reflog = reftable_be_create_reflog,
 	.delete_reflog = reftable_be_delete_reflog,
 	.reflog_expire = reftable_be_reflog_expire,
+
+	.fsck = reftable_be_fsck,
 };

From patchwork Wed Jun 12 08:53:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694663
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 3/7] builtin/refs: add verify subcommand
Date: Wed, 12 Jun 2024 16:53:45 +0800
Message-ID: <20240612085349.710785-4-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com>
 <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the reference database consistency.

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 Documentation/git-refs.txt | 11 ++++++++++
 builtin/refs.c             | 45 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
index 5b99e04385..f9d36ea19d 100644
--- a/Documentation/git-refs.txt
+++ b/Documentation/git-refs.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git refs migrate' --ref-format=<format> [--dry-run]
+'git refs verify' [--strict] [--verbose]

 DESCRIPTION
 -----------
@@ -22,6 +23,9 @@ COMMANDS
 migrate::
 	Migrate ref store between different formats.

+verify::
+	Verify reference database consistency.
+
 OPTIONS
 -------

@@ -39,6 +43,13 @@ include::ref-storage-format.txt[]
 	can be used to double check that the migration works as expected before
 	performing the actual migration.

+--strict::
+	Enable more strict checking, every WARN severity for the `Fsck Messages`
+	be seen as ERROR.
+
+--verbose::
+	When verifying the reference database consistency, be chatty.
+
 KNOWN LIMITATIONS
 -----------------

diff --git a/builtin/refs.c b/builtin/refs.c
index 46dcd150d4..82ed0d57c1 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -1,4 +1,6 @@
 #include "builtin.h"
+#include "config.h"
+#include "fsck.h"
 #include "parse-options.h"
 #include "refs.h"
 #include "repository.h"
@@ -7,6 +9,9 @@
 #define REFS_MIGRATE_USAGE \
 	N_("git refs migrate --ref-format=<format> [--dry-run]")

+#define REFS_VERIFY_USAGE \
+	N_("git refs verify [--strict] [--verbose]")
+
 static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
 {
 	const char * const migrate_usage[] = {
@@ -58,15 +63,55 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
 	return err;
 }

+static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
+{
+	const char * const verify_usage[] = {
+		REFS_VERIFY_USAGE,
+		NULL,
+	};
+	int ret = 0;
+	unsigned int verbose = 0, strict = 0;
+	struct fsck_refs_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+	struct option options[] = {
+		OPT__VERBOSE(&verbose, N_("be verbose")),
+		OPT_BOOL(0, "strict", &strict, N_("enable strict checking")),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
+	if (argc)
+		usage(_("too many arguments"));
+
+	if (verbose)
+		fsck_refs_options.verbose = 1;
+	if (strict)
+		fsck_refs_options.strict = 1;
+
+	git_config(git_fsck_refs_config, &fsck_refs_options);
+	prepare_repo_settings(the_repository);
+
+	ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+
+	/*
+	 * Explicitly free the allocated array. This is necessary because
+	 * this program is executed as child process of git-fsck(1) and the
+	 * allocated array may not freed when git-fsck(1) aborts somewhere.
+	 */
+	free(fsck_refs_options.msg_type);
+	return ret;
+}
+
 int cmd_refs(int argc, const char **argv, const char *prefix)
 {
 	const char * const refs_usage[] = {
 		REFS_MIGRATE_USAGE,
+		REFS_VERIFY_USAGE,
 		NULL,
 	};
 	parse_opt_subcommand_fn *fn = NULL;
 	struct option opts[] = {
 		OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+		OPT_SUBCOMMAND("verify", &fn, cmd_refs_verify),
 		OPT_END(),
 	};


From patchwork Wed Jun 12 08:53:46 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694664
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 4/7] builtin/fsck: add `git-refs verify` child process
Date: Wed, 12 Jun 2024 16:53:46 +0800
Message-ID: <20240612085349.710785-5-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com>
 <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Introduce a new function "fsck_refs" that initializes and runs a child
process to execute the "git-refs verify" command.
Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 builtin/fsck.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c index d13a226c2e..10d73f534f 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -896,6 +896,21 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress) return res; } +static void fsck_refs(void) +{ + struct child_process refs_verify = CHILD_PROCESS_INIT; + child_process_init(&refs_verify); + refs_verify.git_cmd = 1; + strvec_pushl(&refs_verify.args, "refs", "verify", NULL); + if (verbose) + strvec_push(&refs_verify.args, "--verbose"); + if (check_strict) + strvec_push(&refs_verify.args, "--strict"); + + if (run_command(&refs_verify)) + errors_found |= ERROR_REFS; +} + static char const * const fsck_usage[] = { N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n" " [--[no-]full] [--strict] [--verbose] [--lost-found]\n" @@ -1065,6 +1080,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) check_connectivity(); + fsck_refs(); + if (the_repository->settings.core_commit_graph) { struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
From patchwork Wed Jun 12 08:53:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694665
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 5/7] files-backend: add unified interface for refs scanning
Date: Wed, 12 Jun 2024 16:53:47 +0800
Message-ID: <20240612085349.710785-6-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com> <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

For refs and reflogs, we need to
scan their corresponding directories and check every regular file or symbolic link, since both share the same pattern. Introduce a unified interface for scanning directories in the files backend.

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 refs/files-backend.c | 75 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/refs/files-backend.c b/refs/files-backend.c index e965345ad8..b26cfb8ba6 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -4,6 +4,7 @@ #include "../gettext.h" #include "../hash.h" #include "../hex.h" +#include "../fsck.h" #include "../refs.h" #include "refs-internal.h" #include "ref-cache.h" @@ -3402,6 +3403,78 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store, return ret; } +/* + * For refs and reflogs, they share a unified interface when scanning + * the whole directory. This function is used as the callback for each + * regular file or symlink in the directory. + */ +typedef int (*files_fsck_refs_fn)(struct fsck_refs_options *o, + const char *gitdir, + const char *refs_check_dir, + struct dir_iterator *iter); + +static int files_fsck_refs_dir(struct ref_store *ref_store, + struct fsck_refs_options *o, + const char *refs_check_dir, + files_fsck_refs_fn *fsck_refs_fns) +{ + const char *gitdir = ref_store->gitdir; + struct strbuf sb = STRBUF_INIT; + struct dir_iterator *iter; + int iter_status; + int ret = 0; + + strbuf_addf(&sb, "%s/%s", gitdir, refs_check_dir); + + iter = dir_iterator_begin(sb.buf, 0); + + if (!iter) { + ret = error_errno("cannot open directory %s", sb.buf); + goto out; + } + + while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) { + if (S_ISDIR(iter->st.st_mode)) { + continue; + } else if (S_ISREG(iter->st.st_mode) || + S_ISLNK(iter->st.st_mode)) { + if (o->verbose) + fprintf_ln(stderr, "Checking %s/%s", + refs_check_dir, iter->relative_path); + for (size_t i = 0; fsck_refs_fns[i]; i++) { + if (fsck_refs_fns[i](o, 
gitdir, refs_check_dir, iter)) + ret = -1; + } + } else { + ret = error(_("unexpected file type for '%s'"), + iter->basename); + } + } + + if (iter_status != ITER_DONE) + ret = error(_("failed to iterate over '%s'"), sb.buf); + +out: + strbuf_release(&sb); + return ret; +} + +static int files_fsck_refs(struct ref_store *ref_store, + struct fsck_refs_options *o) +{ + int ret; + files_fsck_refs_fn fsck_refs_fns[]= { + NULL + }; + + if (o->verbose) + fprintf_ln(stderr, "Checking references consistency"); + + ret = files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fns); + + return ret; +} + static int files_fsck(struct ref_store *ref_store, struct fsck_refs_options *o) { @@ -3410,6 +3483,8 @@ static int files_fsck(struct ref_store *ref_store, files_downcast(ref_store, REF_STORE_READ, "fsck"); ret = refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); + ret |= files_fsck_refs(ref_store, o); + return ret; }
From patchwork Wed Jun 12 08:53:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694666
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 6/7] fsck: add ref name check for files backend
Date: Wed, 12 Jun 2024 16:53:48 +0800
Message-ID: <20240612085349.710785-7-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com> <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

git-fsck(1) only checks references implicitly; it does not fully check refs with badly formatted names, such as a standalone "@" or a name ending with ".lock".

To provide such checks, add a new fsck message id "badRefName" with default type ERROR. Use the existing "check_refname_format" to check the ref name explicitly, and add a new test to verify the functionality.
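For readers unfamiliar with the rules behind "check_refname_format", the sketch below covers only the three cases that the tests in this patch exercise: a final path component that starts with ".", one that ends with ".lock", and a standalone "@". It is a simplified stand-in and not Git's implementation; the name bad_ref_component is hypothetical, and the real function rejects many more patterns.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: returns 1 if the final path
 * component "name" hits one of the three cases tested in t0602,
 * and 0 otherwise. The real check_refname_format() is much stricter.
 */
static int bad_ref_component(const char *name)
{
	static const char lock_suffix[] = ".lock";
	size_t suffix_len = sizeof(lock_suffix) - 1;
	size_t len;

	assert(name != NULL);
	len = strlen(name);

	if (len == 0)
		return 1;	/* empty component */
	if (name[0] == '.')
		return 1;	/* e.g. ".branch-1" */
	if (!strcmp(name, "@"))
		return 1;	/* standalone "@" */
	if (len >= suffix_len &&
	    !strcmp(name + len - suffix_len, lock_suffix))
		return 1;	/* e.g. "tag-1.lock" */
	return 0;
}
```

Like the patch, the sketch looks at the basename only, so "refs/heads/@" is reported because its last component is exactly "@".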
Mentored-by: Patrick Steinhardt Mentored-by: Karthik Nayak Signed-off-by: shejialuo --- Documentation/fsck-msgids.txt | 3 + fsck.h | 1 + refs/files-backend.c | 20 +++++++ t/t0602-reffiles-fsck.sh | 101 ++++++++++++++++++++++++++++++++++ 4 files changed, 125 insertions(+) create mode 100755 t/t0602-reffiles-fsck.sh diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt index 5edc06c658..cc85c897cc 100644 --- a/Documentation/fsck-msgids.txt +++ b/Documentation/fsck-msgids.txt @@ -19,6 +19,9 @@ `badParentSha1`:: (ERROR) A commit object has a bad parent sha1. +`badRefName`:: + (ERROR) A ref has a bad name. + `badTagName`:: (INFO) A tag has an invalid format. diff --git a/fsck.h b/fsck.h index 7451b1f91b..1423a5e428 100644 --- a/fsck.h +++ b/fsck.h @@ -31,6 +31,7 @@ enum fsck_msg_type { FUNC(BAD_NAME, ERROR) \ FUNC(BAD_OBJECT_SHA1, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ + FUNC(BAD_REF_NAME, ERROR) \ FUNC(BAD_TIMEZONE, ERROR) \ FUNC(BAD_TREE, ERROR) \ FUNC(BAD_TREE_SHA1, ERROR) \ diff --git a/refs/files-backend.c b/refs/files-backend.c index b26cfb8ba6..266f1ffe8a 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -3413,6 +3413,25 @@ typedef int (*files_fsck_refs_fn)(struct fsck_refs_options *o, const char *refs_check_dir, struct dir_iterator *iter); +static int files_fsck_refs_name(struct fsck_refs_options *o, + const char *gitdir UNUSED, + const char *refs_check_dir, + struct dir_iterator *iter) +{ + struct strbuf sb = STRBUF_INIT; + int ret = 0; + + if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) { + strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path); + ret = fsck_refs_report(o, sb.buf, + FSCK_MSG_BAD_REF_NAME, + "invalid refname format"); + } + + strbuf_release(&sb); + return ret; +} + static int files_fsck_refs_dir(struct ref_store *ref_store, struct fsck_refs_options *o, const char *refs_check_dir, @@ -3464,6 +3483,7 @@ static int files_fsck_refs(struct ref_store *ref_store, { int ret; 
files_fsck_refs_fn fsck_refs_fns[]= { + files_fsck_refs_name, NULL }; diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh new file mode 100755 index 0000000000..b2db58d2c6 --- /dev/null +++ b/t/t0602-reffiles-fsck.sh @@ -0,0 +1,101 @@ +#!/bin/sh + +test_description='Test reffiles backend consistency check' + +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +GIT_TEST_DEFAULT_REF_FORMAT=files +export GIT_TEST_DEFAULT_REF_FORMAT + +. ./test-lib.sh + +test_expect_success 'ref name should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 && + git tag multi_hierarchy/tag-2 + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/heads/.branch-1: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/.branch-1 && + test_cmp expect err + ) && + ( + cd repo && + cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/tag-1.lock: badRefName: invalid refname format + EOF + rm $tag_dir_prefix/tag-1.lock && + test_cmp expect err + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/heads/@: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/@ && + test_cmp expect err + ) && + ( + cd repo && + cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format + EOF + rm $tag_dir_prefix/multi_hierarchy/@ 
&& + test_cmp expect err + ) +' + +test_expect_success 'ref name check should be adapted into fsck messages' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && + git -c fsck.badRefName=warn fsck 2>err && + cat >expect <<-EOF && + warning: refs/heads/.branch-1: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/.branch-1 && + test_cmp expect err + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ && + git -c fsck.badRefName=ignore fsck 2>err && + test_must_be_empty err + ) +' + +test_done
From patchwork Wed Jun 12 08:53:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13694667
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine, shejialuo
Subject: [GSoC][PATCH v2 7/7] fsck: add ref content check for files backend
Date: Wed, 12 Jun 2024 16:53:49 +0800
Message-ID: <20240612085349.710785-8-shejialuo@gmail.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240612085349.710785-1-shejialuo@gmail.com>
References: <20240530122753.1114818-1-shejialuo@gmail.com> <20240612085349.710785-1-shejialuo@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Enhance the git-fsck(1) command by adding a check for reference content in the files backend. The new functionality ensures that symrefs, real symbolic links and regular refs are validated correctly.

Add a new function "files_fsck_symref" to check whether a symref or symbolic link points to a valid object, and a new function "files_fsck_refs_content" to handle both regular refs and symbolic refs.
In order to check the trailing content, add a new parameter "trailing" to the "parse_loose_ref_contents" function.

Finally, add the following fsck messages:

1. "badRefSha(ERROR)": A ref has a bad sha.
2. "danglingSymref(WARN)": Found a dangling symref.
3. "trailingRefContent(WARN)": A ref content has trailing contents.

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 Documentation/fsck-msgids.txt |   9 +++
 fsck.h                        |   3 +
 refs.c                        |   2 +-
 refs/files-backend.c          | 124 +++++++++++++++++++++++++++++++++-
 refs/refs-internal.h          |   5 +-
 t/t0602-reffiles-fsck.sh      | 110 ++++++++++++++++++++++++++++++
 6 files changed, 248 insertions(+), 5 deletions(-)

diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt index cc85c897cc..69f86c5345 100644 --- a/Documentation/fsck-msgids.txt +++ b/Documentation/fsck-msgids.txt @@ -22,6 +22,9 @@ `badRefName`:: (ERROR) A ref has a bad name. +`badRefSha`:: + (ERROR) A ref has a bad sha. + `badTagName`:: (INFO) A tag has an invalid format. @@ -37,6 +40,9 @@ `badType`:: (ERROR) Found an invalid object type. +`danglingSymref`:: + (WARN) Found a dangling symref. + `duplicateEntries`:: (ERROR) A tree contains duplicate file entries. @@ -179,6 +185,9 @@ `symlinkTargetMissing`:: (ERROR) Unable to read symbolic link target's blob. +`trailingRefContent`:: + (WARN) A ref content has trailing contents. + `treeNotSorted`:: (ERROR) A tree is not properly sorted. 
diff --git a/fsck.h b/fsck.h index 1423a5e428..5a55a567b0 100644 --- a/fsck.h +++ b/fsck.h @@ -32,6 +32,7 @@ enum fsck_msg_type { FUNC(BAD_OBJECT_SHA1, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ FUNC(BAD_REF_NAME, ERROR) \ + FUNC(BAD_REF_SHA, ERROR) \ FUNC(BAD_TIMEZONE, ERROR) \ FUNC(BAD_TREE, ERROR) \ FUNC(BAD_TREE_SHA1, ERROR) \ @@ -69,11 +70,13 @@ enum fsck_msg_type { FUNC(SYMLINK_TARGET_BLOB, ERROR) \ /* warnings */ \ FUNC(EMPTY_NAME, WARN) \ + FUNC(DANGLING_SYMREF, WARN) \ FUNC(FULL_PATHNAME, WARN) \ FUNC(HAS_DOT, WARN) \ FUNC(HAS_DOTDOT, WARN) \ FUNC(HAS_DOTGIT, WARN) \ FUNC(NULL_SHA1, WARN) \ + FUNC(TRAILING_REF_CONTENT, WARN) \ FUNC(ZERO_PADDED_FILEMODE, WARN) \ FUNC(NUL_IN_COMMIT, WARN) \ FUNC(LARGE_PATHNAME, WARN) \ diff --git a/refs.c b/refs.c index 0922439275..1325f83269 100644 --- a/refs.c +++ b/refs.c @@ -1744,7 +1744,7 @@ static int refs_read_special_head(struct ref_store *ref_store, } result = parse_loose_ref_contents(content.buf, oid, referent, type, - failure_errno); + failure_errno, NULL); done: strbuf_release(&full_path); diff --git a/refs/files-backend.c b/refs/files-backend.c index 266f1ffe8a..17d3e433f1 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -549,7 +549,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname, strbuf_rtrim(&sb_contents); buf = sb_contents.buf; - ret = parse_loose_ref_contents(buf, oid, referent, type, &myerr); + ret = parse_loose_ref_contents(buf, oid, referent, type, &myerr, NULL); out: if (ret && !myerr) @@ -585,7 +585,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn int parse_loose_ref_contents(const char *buf, struct object_id *oid, struct strbuf *referent, unsigned int *type, - int *failure_errno) + int *failure_errno, unsigned int *trailing) { const char *p; if (skip_prefix(buf, "ref:", &buf)) { @@ -607,6 +607,10 @@ int parse_loose_ref_contents(const char *buf, struct object_id *oid, *failure_errno = EINVAL; return -1; } + + if 
(trailing && (*p != '\0' && *p != '\n')) + *trailing = 1; + return 0; } @@ -3432,6 +3436,121 @@ static int files_fsck_refs_name(struct fsck_refs_options *o, return ret; } +static int files_fsck_symref(struct fsck_refs_options *o, + struct strbuf *refname, + struct strbuf *path) +{ + struct stat st; + int ret = 0; + + if (lstat(path->buf, &st) < 0) { + ret = fsck_refs_report(o, refname->buf, + FSCK_MSG_DANGLING_SYMREF, + "point to non-existent ref"); + goto out; + } + + if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) { + ret = fsck_refs_report(o, refname->buf, + FSCK_MSG_DANGLING_SYMREF, + "point to invalid object"); + goto out; + } +out: + return ret; +} + +static int files_fsck_refs_content(struct fsck_refs_options *o, + const char *gitdir, + const char *refs_check_dir, + struct dir_iterator *iter) +{ + struct strbuf path = STRBUF_INIT, + refname = STRBUF_INIT, + ref_content = STRBUF_INIT, + referent = STRBUF_INIT; + unsigned int trailing = 0; + int failure_errno = 0; + unsigned int type = 0; + struct object_id oid; + int ret = 0; + + strbuf_addbuf(&path, &iter->path); + strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path); + + /* + * If the file is a symlink, we need to only check the connectivity + * of the destination object. 
+	 */
+	if (S_ISLNK(iter->st.st_mode)) {
+		strbuf_strip_file_from_path(&path);
+		ret = strbuf_readlink(&ref_content,
+				      iter->path.buf, iter->st.st_size);
+		if (ret < 0) {
+			ret = error_errno(_("could not read link '%s'"),
+					  iter->path.buf);
+			goto clean;
+		}
+		strbuf_addbuf(&path, &ref_content);
+		strbuf_reset(&ref_content);
+
+		ret = files_fsck_symref(o, &refname, &path);
+		goto clean;
+	}
+
+	if (strbuf_read_file(&ref_content, path.buf, 0) < 0) {
+		ret = error_errno(_("%s/%s: unable to read the ref"),
+				  refs_check_dir, iter->relative_path);
+		goto clean;
+	}
+
+	if (parse_loose_ref_contents(ref_content.buf, &oid,
+				     &referent, &type,
+				     &failure_errno, &trailing)) {
+		ret = fsck_refs_report(o, refname.buf,
+				       FSCK_MSG_BAD_REF_SHA,
+				       "invalid ref content");
+		goto clean;
+	}
+
+	/*
+	 * If the ref is a symref, we need to check the destination name and
+	 * connectivity.
+	 */
+	if (referent.len && (type & REF_ISSYMREF)) {
+		strbuf_reset(&path);
+		strbuf_addf(&path, "%s/%s", gitdir, referent.buf);
+
+		if (check_refname_format(referent.buf, 0)) {
+			ret = fsck_refs_report(o, refname.buf,
+					       FSCK_MSG_DANGLING_SYMREF,
+					       "point to invalid refname");
+			goto clean;
+		}
+
+		ret = files_fsck_symref(o, &refname, &path);
+		goto clean;
+	} else {
+		/*
+		 * Only regular refs could have a trailing garbage. Should
+		 * be reported as a warning.
+		 */
+		if (trailing) {
+			ret = fsck_refs_report(o, refname.buf,
+					       FSCK_MSG_TRAILING_REF_CONTENT,
+					       "trailing garbage in ref");
+			goto clean;
+		}
+	}
+
+clean:
+	strbuf_release(&path);
+	strbuf_release(&refname);
+	strbuf_release(&ref_content);
+	strbuf_release(&referent);
+	return ret;
+}
+
 static int files_fsck_refs_dir(struct ref_store *ref_store,
 			       struct fsck_refs_options *o,
 			       const char *refs_check_dir,
@@ -3484,6 +3603,7 @@ static int files_fsck_refs(struct ref_store *ref_store,
 	int ret;
 	files_fsck_refs_fn fsck_refs_fns[]= {
 		files_fsck_refs_name,
+		files_fsck_refs_content,
 		NULL
 	};
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 8f42f21e77..eb3a7cdcc1 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -709,11 +709,12 @@ struct ref_store {
 
 /*
  * Parse contents of a loose ref file. *failure_errno maybe be set to EINVAL for
- * invalid contents.
+ * invalid contents. Also trailing is set to 1 when there is any bytes after the
+ * hex.
  */
 int parse_loose_ref_contents(const char *buf, struct object_id *oid,
 			     struct strbuf *referent, unsigned int *type,
-			     int *failure_errno);
+			     int *failure_errno, unsigned int *trailing);
 
 /*
  * Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index b2db58d2c6..94cb93bf92 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -98,4 +98,114 @@ test_expect_success 'ref name check should be adapted into fsck messages' '
 	)
 '
 
+test_expect_success 'regular ref content should be checked' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	branch_dir_prefix=.git/refs/heads &&
+	tag_dir_prefix=.git/refs/tags &&
+	(
+		cd repo &&
+		git commit --allow-empty -m initial &&
+		git checkout -b branch-1 &&
+		git tag tag-1 &&
+		git commit --allow-empty -m second &&
+		git checkout -b branch-2 &&
+		git tag tag-2 &&
+		git checkout -b a/b/tag-2
+	) &&
+	(
+		cd repo &&
+		printf "%s garbage" "$(git rev-parse branch-1)" > $branch_dir_prefix/branch-1-garbage &&
+		git fsck 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-1-garbage: trailingRefContent: trailing garbage in ref
+		EOF
+		rm $branch_dir_prefix/branch-1-garbage &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "%s garbage" "$(git rev-parse tag-1)" > $tag_dir_prefix/tag-1-garbage &&
+		test_must_fail git -c fsck.trailingRefContent=error fsck 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/tags/tag-1-garbage: trailingRefContent: trailing garbage in ref
+		EOF
+		rm $tag_dir_prefix/tag-1-garbage &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "%s " "$(git rev-parse tag-2)" > $tag_dir_prefix/tag-2-garbage &&
+		git fsck 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/tags/tag-2-garbage: trailingRefContent: trailing garbage in ref
+		EOF
+		rm $tag_dir_prefix/tag-2-garbage &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "xfsazqfxcadas" > $tag_dir_prefix/tag-2-bad &&
+		test_must_fail git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/tags/tag-2-bad: badRefSha: invalid ref content
+		EOF
+		rm $tag_dir_prefix/tag-2-bad &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "xfsazqfxcadas" > $branch_dir_prefix/a/b/branch-2-bad &&
+		test_must_fail git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/heads/a/b/branch-2-bad: badRefSha: invalid ref content
+		EOF
+		rm $branch_dir_prefix/a/b/branch-2-bad &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'symbolic ref content should be checked' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	branch_dir_prefix=.git/refs/heads &&
+	tag_dir_prefix=.git/refs/tags &&
+	(
+		cd repo &&
+		git commit --allow-empty -m initial &&
+		git checkout -b branch-1 &&
+		git tag tag-1
+	) &&
+	(
+		cd repo &&
+		printf "ref: refs/heads/branch-3" > $branch_dir_prefix/branch-2-bad &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-2-bad: danglingSymref: point to non-existent ref
+		EOF
+		rm $branch_dir_prefix/branch-2-bad &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "ref: refs/heads/.branch" > $branch_dir_prefix/branch-2-bad &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-2-bad: danglingSymref: point to invalid refname
+		EOF
+		rm $branch_dir_prefix/branch-2-bad &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "ref: refs/heads" > $branch_dir_prefix/branch-2-bad &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-2-bad: danglingSymref: point to invalid object
+		EOF
+		rm $branch_dir_prefix/branch-2-bad &&
+		test_cmp expect err
+	)
+'
+
 test_done