From patchwork Tue May 10 19:26:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin X-Patchwork-Id: 12845449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 943D5C433FE for ; Tue, 10 May 2022 19:27:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239805AbiEJT1c (ORCPT ); Tue, 10 May 2022 15:27:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236093AbiEJT1P (ORCPT ); Tue, 10 May 2022 15:27:15 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A83A2C128 for ; Tue, 10 May 2022 12:27:09 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id d5so8145wrb.6 for ; Tue, 10 May 2022 12:27:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=v2bVxgiXyAJb5mJiV2jLmDxYyCHvoUPxWAqoNPpKzrk=; b=ZFChZdDA7j+F47Wdxf7nG5si6H6/eP/C+jLOWvpEzr6wsUy6A2Gnbk7s2HS7ufErNt 6hNRfNA/YwKNgJwcalg80jMNEDSDHuUaOB4QJ1y4vPobXtqILldozyv7jmvqsJr4UvGm xsW5dANwjv6WJCI+knAX/antBw3sD8YwvEwUui12mzd6ItNVjV7+BwpVvWqUJOaA7Ag1 Tps69Pg9RVI/tMpAj9REu3sK8MiI1+6UQagr7sXdYeQmeYDEDvXAuVYM3ruAUj/wxFQ0 DT4B05OXsWf8ex03MZpXeCMAjLVlGf/T0unZrPTyMr1JLmkNxnVqJUUfFcvyP6/fPKAI jgQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=v2bVxgiXyAJb5mJiV2jLmDxYyCHvoUPxWAqoNPpKzrk=; 
b=df1w2DRLaY+jKvPVo0LBlrKScbtD1nEwyO5rX0dkeGjnFXyNoy+5ZfMeqiV1X0jPRd iK6n4Mf59NLj6ED0EyC6p1g/qVjmZyNx2yqMmcNWziqutj6RLyI4fSUnbRNm4tU+P2uQ MueUBQwjhu69Y0yuJ7P0kAnPFA9BPCNu03l45J2YMAXLVt+7d4puh7oEyODz1YILrNsX NbavIVeC4tGiMe4BozEeG/mciEYmKcnGZWgeudE7ssCjygB5zMtDvoM5641KSLY5NBGz v7obbfhd/+XyiDtxlEOMTyaB9uEgpiKfXBt0AKrHCuQoRQt22vob/1klj3SpcbusbIPB Le9w== X-Gm-Message-State: AOAM53250x/ynsNx3k/cJZcoO0N9ML4GLeIEzOnUpW/vOcnkzh4bQrDy Efv88U9323THVxHoD0EJS0qcag9vao4= X-Google-Smtp-Source: ABdhPJzK0NxEHTV/+8Vw3vNgVsgqjfxWh0bLANQxvybahaqguF0eAOx3NAqxb6gKAlhuKyVWlLv/cA== X-Received: by 2002:a5d:678b:0:b0:20a:db0b:7395 with SMTP id v11-20020a5d678b000000b0020adb0b7395mr20344374wru.668.1652210827794; Tue, 10 May 2022 12:27:07 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q64-20020a1c4343000000b003942a244ec2sm114641wma.7.2022.05.10.12.27.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:07 -0700 (PDT) Message-Id: <45662cf582ab7c8b1c32f55c9a34f4d73a28b71d.1652210824.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 10 May 2022 19:26:58 +0000 Subject: [PATCH v4 1/7] archive: optionally add "virtual" files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Taylor Blau , Derrick Stolee , Elijah Newren , Johannes Schindelin , Johannes Schindelin Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Johannes Schindelin From: Johannes Schindelin With the `--add-file-with-content=<path>:<content>` option, `git archive` now supports use cases where relatively trivial files need to be added that do not exist on disk. This will allow us to generate `.zip` files with generated content, without having to add said content to the object database and without having to write it out to disk. 
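As a rough illustration only (not part of this patch), splitting the option's argument at the first colon could be sketched as below. `split_path_content()` is a hypothetical stand-alone helper that uses plain `malloc()`; the actual change does this inside `add_file_cb()` with Git's `xstrndup()` and `xstrdup()`, and dies with "missing colon" instead of returning an error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration: split "path:content" at the
 * first colon. On success, stores newly allocated copies in *path and
 * *content and returns 0. Returns -1 if there is no colon or if an
 * allocation fails.
 */
static int split_path_content(const char *arg, char **path, char **content)
{
	const char *colon;
	size_t path_len;

	assert(arg && path && content);

	colon = strchr(arg, ':');
	if (!colon)
		return -1; /* the patch reports "missing colon" here */

	/* everything before the first colon is the path */
	path_len = (size_t)(colon - arg);
	*path = malloc(path_len + 1);
	if (!*path)
		return -1;
	memcpy(*path, arg, path_len);
	(*path)[path_len] = '\0';

	/* everything after the first colon, further colons included */
	*content = malloc(strlen(colon + 1) + 1);
	if (!*content) {
		free(*path);
		*path = NULL;
		return -1;
	}
	strcpy(*content, colon + 1);
	return 0;
}
```

Because only the first colon acts as the separator, an argument such as `a:b:c` yields the path `a` and the content `b:c`, which is why a path containing a colon cannot be expressed with this syntax alone.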
Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano --- Documentation/git-archive.txt | 11 ++++++++ archive.c | 51 +++++++++++++++++++++++++++++------ t/t5003-archive-zip.sh | 12 +++++++++ 3 files changed, 66 insertions(+), 8 deletions(-) diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt index bc4e76a7834..a0edc9167b2 100644 --- a/Documentation/git-archive.txt +++ b/Documentation/git-archive.txt @@ -61,6 +61,17 @@ OPTIONS by concatenating the value for `--prefix` (if any) and the basename of . +--add-file-with-content=::: + Add the specified contents to the archive. Can be repeated to add + multiple files. The path of the file in the archive is built + by concatenating the value for `--prefix` (if any) and the + basename of . ++ +The `` cannot contain any colon, the file mode is limited to +a regular file, and the option may be subject to platform-dependent +command-line limits. For non-trivial cases, write an untracked file +and use `--add-file` instead. + --worktree-attributes:: Look for attributes in .gitattributes files in the working tree as well (see <>). 
diff --git a/archive.c b/archive.c index a3bbb091256..d798624cd5f 100644 --- a/archive.c +++ b/archive.c @@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid, struct extra_file_info { char *base; struct stat stat; + void *content; }; int write_archive_entries(struct archiver_args *args, @@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args, strbuf_addstr(&path_in_archive, basename(path)); strbuf_reset(&content); - if (strbuf_read_file(&content, path, info->stat.st_size) < 0) + if (info->content) + err = write_entry(args, &fake_oid, path_in_archive.buf, + path_in_archive.len, + info->stat.st_mode, + info->content, info->stat.st_size); + else if (strbuf_read_file(&content, path, + info->stat.st_size) < 0) err = error_errno(_("could not read '%s'"), path); else err = write_entry(args, &fake_oid, path_in_archive.buf, @@ -493,6 +500,7 @@ static void extra_file_info_clear(void *util, const char *str) { struct extra_file_info *info = util; free(info->base); + free(info->content); free(info); } @@ -514,14 +522,38 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset) if (!arg) return -1; - path = prefix_filename(args->prefix, arg); - item = string_list_append_nodup(&args->extra_files, path); - item->util = info = xmalloc(sizeof(*info)); + info = xmalloc(sizeof(*info)); info->base = xstrdup_or_null(base); - if (stat(path, &info->stat)) - die(_("File not found: %s"), path); - if (!S_ISREG(info->stat.st_mode)) - die(_("Not a regular file: %s"), path); + + if (!strcmp(opt->long_name, "add-file")) { + path = prefix_filename(args->prefix, arg); + if (stat(path, &info->stat)) + die(_("File not found: %s"), path); + if (!S_ISREG(info->stat.st_mode)) + die(_("Not a regular file: %s"), path); + info->content = NULL; /* read the file later */ + } else { + const char *colon = strchr(arg, ':'); + char *p; + + if (!colon) + die(_("missing colon: '%s'"), arg); + + p = xstrndup(arg, colon - arg); + if 
(!args->prefix) + path = p; + else { + path = prefix_filename(args->prefix, p); + free(p); + } + memset(&info->stat, 0, sizeof(info->stat)); + info->stat.st_mode = S_IFREG | 0644; + info->content = xstrdup(colon + 1); + info->stat.st_size = strlen(info->content); + } + item = string_list_append_nodup(&args->extra_files, path); + item->util = info; + return 0; } @@ -554,6 +586,9 @@ static int parse_archive_args(int argc, const char **argv, { OPTION_CALLBACK, 0, "add-file", args, N_("file"), N_("add untracked file to archive"), 0, add_file_cb, (intptr_t)&base }, + { OPTION_CALLBACK, 0, "add-file-with-content", args, + N_("path:content"), N_("add untracked file to archive"), 0, + add_file_cb, (intptr_t)&base }, OPT_STRING('o', "output", &output, N_("file"), N_("write the archive to this file")), OPT_BOOL(0, "worktree-attributes", &worktree_attributes, diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index 1e6d18b140e..8ff1257f1a0 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' ' check_zip with_untracked check_added with_untracked untracked untracked +test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' ' + git archive --format=zip >with_file_with_content.zip \ + --add-file-with-content=hello:world $EMPTY_TREE && + test_when_finished "rm -rf tmp-unpack" && + mkdir tmp-unpack && ( + cd tmp-unpack && + "$GIT_UNZIP" ../with_file_with_content.zip && + test_path_is_file hello && + test world = $(cat hello) + ) +' + test_expect_success 'git archive --format=zip --add-file twice' ' echo untracked >untracked && git archive --format=zip --prefix=one/ --add-file=untracked \ From patchwork Tue May 10 19:26:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin X-Patchwork-Id: 12845444 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41C5CC433F5 for ; Tue, 10 May 2022 19:27:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231330AbiEJT1U (ORCPT ); Tue, 10 May 2022 15:27:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35660 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238248AbiEJT1P (ORCPT ); Tue, 10 May 2022 15:27:15 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AA6B30551 for ; Tue, 10 May 2022 12:27:10 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id 129so10789005wmz.0 for ; Tue, 10 May 2022 12:27:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=yH45y3rNUaaonemIU29IzN0qO3F7ZntSRBJ/LoN3o8s=; b=gOMI27ugYpF5sUNXomvsBobHwpluQYXOvtJM2bZkdozOS+RGikFPtU/QtPcJEqwaqW 0J2z65kMmbUlzlMe9dRVu/gwUYrre7f9BM27KNnVJ4KYe7/KkLoSwlOOCLlCzt7UGVad h8Qbk6IP+/iy7JOZJk/lsWtjIcxYIf/hZX20Y2DRPIn0t/mK7Gfw04c6vvPgHA9Vz7wi 2Tkid3bN6ZzoyvjgvNqQfj4TY+EfF3fWXOsHcmTJBHcSY+kx1FsKGzhflmwOND6U+qM+ SmHwBm+diT68xFw5vsHhvgJKv4FAzxImJnjxHa0VELOABWX/aF+DixEBrQFw/4qZdtkh WFcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=yH45y3rNUaaonemIU29IzN0qO3F7ZntSRBJ/LoN3o8s=; b=8IzG6iF7XSKsz2yEpAJglgeb9kiHaaTCGeufyoMA9lfJ5H0kDZOCl7ZrfKE+yolPJC brHA1Pi0GdE6UZgj2gHJpwzAcgel+SLuI+i92GFWiIIbAohXvy6Zq4V+n5UDD98ULhg2 i4LEXoXlcU6hlas7I0WGOWTO8mpVqPBod2xJy9RDDN2c5NCKkBVhIEKk9WjGTHp+gc/h RBSljUA2dspqUqmWhxvI6U1LDuHnL+FeJqAKpLrvfoAefyuc3tZYI9wVfP5yFlS7fYuf 
LrophX/fRXImH48vUmAjBSoC6G7NuQnxBMolSNrQyKLnn/AarqgEsW3rX5YvGtmgkRmB YV9w== X-Gm-Message-State: AOAM530EdzOuOYP4Lk+mYD49trVkFnxDAMnJZk8g/iBo1X/Fi5YwDD1h pV2AKIswdA/+ipV3W91r0heGq6BbdCo= X-Google-Smtp-Source: ABdhPJzfukqXxj5vKt1wvmw6bv15W3D2kv+5WC8Z7/4Z7+2fQ2d37BbY81+VC36dBBnpKKOLJqiPRg== X-Received: by 2002:a1c:4e08:0:b0:393:fd06:c2ce with SMTP id g8-20020a1c4e08000000b00393fd06c2cemr1435599wmh.91.1652210829108; Tue, 10 May 2022 12:27:09 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id ay7-20020a05600c1e0700b003945781b725sm3589690wmb.37.2022.05.10.12.27.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:08 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 10 May 2022 19:26:59 +0000 Subject: [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Taylor Blau , Derrick Stolee , Elijah Newren , Johannes Schindelin , Johannes Schindelin Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Johannes Schindelin From: Johannes Schindelin By allowing the path to be enclosed in double-quotes, we can avoid the limitation that paths cannot contain colons. Signed-off-by: Johannes Schindelin --- Documentation/git-archive.txt | 14 ++++++++++---- archive.c | 30 ++++++++++++++++++++---------- t/t5003-archive-zip.sh | 8 ++++++++ 3 files changed, 38 insertions(+), 14 deletions(-) diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt index a0edc9167b2..21eab5690ad 100644 --- a/Documentation/git-archive.txt +++ b/Documentation/git-archive.txt @@ -67,10 +67,16 @@ OPTIONS by concatenating the value for `--prefix` (if any) and the basename of . + -The `` cannot contain any colon, the file mode is limited to -a regular file, and the option may be subject to platform-dependent -command-line limits. 
For non-trivial cases, write an untracked file -and use `--add-file` instead. +The `<path>` argument can start and end with a literal double-quote +character; the contained file name is interpreted as a C-style string, +i.e. the backslash is interpreted as an escape character. The path must +be quoted if it contains a colon, to prevent the colon from being +misinterpreted as the separator between the path and the contents, or +if the path begins or ends with a double-quote character. ++ +The file mode is limited to a regular file, and the option may be +subject to platform-dependent command-line limits. For non-trivial +cases, write an untracked file and use `--add-file` instead. --worktree-attributes:: Look for attributes in .gitattributes files in the working tree diff --git a/archive.c b/archive.c index d798624cd5f..477eba60ac3 100644 --- a/archive.c +++ b/archive.c @@ -9,6 +9,7 @@ #include "parse-options.h" #include "unpack-trees.h" #include "dir.h" +#include "quote.h" static char const * const archive_usage[] = { N_("git archive [] [...]"), @@ -533,22 +534,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset) die(_("Not a regular file: %s"), path); info->content = NULL; /* read the file later */ } else { - const char *colon = strchr(arg, ':'); - char *p; + struct strbuf buf = STRBUF_INIT; + const char *p = arg; + + if (*p != '"') + p = strchr(p, ':'); + else if (unquote_c_style(&buf, p, &p) < 0) + die(_("unclosed quote: '%s'"), arg); - if (!colon) + if (!p || *p != ':') die(_("missing colon: '%s'"), arg); - p = xstrndup(arg, colon - arg); - if (!args->prefix) - path = p; - else { - path = prefix_filename(args->prefix, p); - free(p); + if (p == arg) + die(_("empty file name: '%s'"), arg); + + path = buf.len ? 
+ strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg); + + if (args->prefix) { + char *save = path; + path = prefix_filename(args->prefix, path); + free(save); } memset(&info->stat, 0, sizeof(info->stat)); info->stat.st_mode = S_IFREG | 0644; - info->content = xstrdup(colon + 1); + info->content = xstrdup(p + 1); info->stat.st_size = strlen(info->content); } item = string_list_append_nodup(&args->extra_files, path); diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index 8ff1257f1a0..5b8bbfc2692 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -207,13 +207,21 @@ check_zip with_untracked check_added with_untracked untracked untracked test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' ' + if test_have_prereq FUNNYNAMES + then + QUOTED=quoted:colon + else + QUOTED=quoted + fi && git archive --format=zip >with_file_with_content.zip \ + --add-file-with-content=\"$QUOTED\": \ --add-file-with-content=hello:world $EMPTY_TREE && test_when_finished "rm -rf tmp-unpack" && mkdir tmp-unpack && ( cd tmp-unpack && "$GIT_UNZIP" ../with_file_with_content.zip && test_path_is_file hello && + test_path_is_file $QUOTED && test world = $(cat hello) ) ' From patchwork Tue May 10 19:27:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin X-Patchwork-Id: 12845445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B60AFC433EF for ; Tue, 10 May 2022 19:27:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234790AbiEJT1V (ORCPT ); Tue, 10 May 2022 15:27:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35982 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239222AbiEJT1R (ORCPT 
); Tue, 10 May 2022 15:27:17 -0400 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D28A924091 for ; Tue, 10 May 2022 12:27:12 -0700 (PDT) Received: by mail-wr1-x435.google.com with SMTP id i5so25143700wrc.13 for ; Tue, 10 May 2022 12:27:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Uo6VeR463A8bpTnDe4txbUadyp7c2hS7RwvOwDAMeOI=; b=e+yaCp2AWVyPT9Qwax0nNgC6nDMBTNHjoL7sx7vWZucrpMtXMmBj3aMTH/yUNyMAqt XGUCJgCNpAke6HHcovapI8quOEy40l/GfZ2ogs1OT0yQJNpv1kdLYunBdCuTHFZtgHWK v9HhBiWmey0kjqtjwOqg8zjJjpkeoCd8gphYcGDVYMtluNN73k3Qisrlqa1nd9IoLEE+ UNp9J37cbLK2cs8kMu+OQWAVg9NlzClDbTn/jtuDKLuIjxY05fLE1mOygRhNdsHgZWcL VpC3iPv9xx8PWLw8GTaJe2viDpbiKLhAGZKWf/sDHvnrT9datXHxWXnH+Zi7TcW7Bd9L ynAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Uo6VeR463A8bpTnDe4txbUadyp7c2hS7RwvOwDAMeOI=; b=jPPJWdDWjCCjgKF8ekN2WjRrKr0/vDwW2nkDP9UvTFgUgwHO3SWYAqGw0H7OaVUjNJ Wg8ZogLMWISk/9B3/2lxlCqcBulN9Lz+VggzIurmFsAJa1Cb6g51Tqr0R6h3YGTe1l0K r7LNLE01mVkJloT9vc3iaz1NEa4fFkE77lmJPm3lpt1Rl6E5MyHFjCKCQ6LB0Y9/PJcs 9zozHPwoOpxJid4Jk/pAU2tG6d3bHUnRk1KhPiTsYkiiR4c3QXb7y4ILQKn5HywVx/ao sVSUqXzkf7s4GKS9pZxlzca5/5Fxt7C4nSLtK+zsQoSNHwV+gofLvm+K4fE9Rwi+RrpG JhFg== X-Gm-Message-State: AOAM5301zxULN5MLQ5XIc1iQ4TCkhEuJSs9nhDUZG05sMIsR6njyLkF9 w1jrIAmnh9FrnHQR7TDLspycyKvf5N0= X-Google-Smtp-Source: ABdhPJwiWiuS5Sfj8W38NgVSFxtq49VAH18r36TjjPtQ+qBc0ef2dxd37ENgcyPaFr5XzI+qDl2dUg== X-Received: by 2002:a05:6000:707:b0:20c:4fd8:1d61 with SMTP id bs7-20020a056000070700b0020c4fd81d61mr20421118wrb.407.1652210830875; Tue, 10 May 2022 12:27:10 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com 
with ESMTPSA id d12-20020adffd8c000000b0020c5253d8f2sm14324436wrr.62.2022.05.10.12.27.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:09 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 10 May 2022 19:27:00 +0000 Subject: [PATCH v4 3/7] scalar: validate the optional enlistment argument Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Taylor Blau , Derrick Stolee , Elijah Newren , Johannes Schindelin , Johannes Schindelin Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Johannes Schindelin From: Johannes Schindelin The `scalar` command needs a Scalar enlistment for many subcommands, and looks in the current directory for such an enlistment (traversing the parent directories until it finds one). These subcommands can also be called with an optional argument specifying the enlistment. Here, too, we traverse parent directories as needed, until we find an enlistment. However, if the specified directory does not even exist, or is not a directory, we should stop right there, with an error message. 
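For illustration only: the check described above amounts to a `stat()` call followed by an `S_ISDIR()` test. The sketch below uses a hypothetical helper, `path_is_directory()`, as a stand-in for Git's `is_directory()`; it is not the code from this patch and assumes a POSIX system.

```c
#include <assert.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for Git's is_directory(): returns 1 if `path`
 * exists and is a directory, 0 otherwise (missing path, regular file,
 * or any other stat() failure).
 */
static int path_is_directory(const char *path)
{
	struct stat st;

	assert(path);
	return !stat(path, &st) && S_ISDIR(st.st_mode);
}
```

A caller in the spirit of the patch would resolve the optional argument to an absolute path first and report "does not exist" as soon as this check fails, rather than walking up the parent directories.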
Signed-off-by: Johannes Schindelin --- contrib/scalar/scalar.c | 6 ++++-- contrib/scalar/t/t9099-scalar.sh | 5 +++++ 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c index 1ce9c2b00e8..00dcd4b50ef 100644 --- a/contrib/scalar/scalar.c +++ b/contrib/scalar/scalar.c @@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv, usage_with_options(usagestr, options); /* find the worktree, determine its corresponding root */ - if (argc == 1) + if (argc == 1) { strbuf_add_absolute_path(&path, argv[0]); - else if (strbuf_getcwd(&path) < 0) + if (!is_directory(path.buf)) + die(_("'%s' does not exist"), path.buf); + } else if (strbuf_getcwd(&path) < 0) die(_("need a working directory")); strbuf_trim_trailing_dir_sep(&path); diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh index 2e1502ad45e..9d83fdf25e8 100755 --- a/contrib/scalar/t/t9099-scalar.sh +++ b/contrib/scalar/t/t9099-scalar.sh @@ -85,4 +85,9 @@ test_expect_success 'scalar delete with enlistment' ' test_path_is_missing cloned ' +test_expect_success '`scalar [...] ` errors out when dir is missing' ' + ! scalar run config cloned 2>err && + grep "cloned. 
does not exist" err +' + test_done From patchwork Tue May 10 19:27:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin X-Patchwork-Id: 12845446 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9446FC433FE for ; Tue, 10 May 2022 19:27:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239222AbiEJT1Z (ORCPT ); Tue, 10 May 2022 15:27:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234816AbiEJT1S (ORCPT ); Tue, 10 May 2022 15:27:18 -0400 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E9AB36156 for ; Tue, 10 May 2022 12:27:15 -0700 (PDT) Received: by mail-wm1-x331.google.com with SMTP id p189so10781014wmp.3 for ; Tue, 10 May 2022 12:27:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=jxReg2YKrDvXt32NQ6dPIJlwTPUNqg837mpQhjNd/pA=; b=GL9IEeloFMqMFqPop6S17AqM2k24Y2a67/4nBTLNk+6eyUCF7XHaRKruqo8fSIOYDb O/KU/Ozydh0rHUcLTXhZjwG4E7gwz5j+r2xiI6EFRf2CN29X5FT9I/p9aGPMlXXTrQaK AiJ3zl6J1Hf5NkkNn1XhQkJVLFa4ZFaKt19cDjovqXYSymXz3tpLNT523RSBZF5qBFev Wa1blIerf1lsOVYAG/RXSGZ04rgzgKKuDPm3ZJMSyFG8VRpVVzOHHFlwOaO9+phjBojp jwC54dgK4CbMxBbVWHztun9FdxZdZvLK0saQxR5xLnRRTDtq8zw25vTXtd9p+vpmTL/f LaKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; 
bh=jxReg2YKrDvXt32NQ6dPIJlwTPUNqg837mpQhjNd/pA=; b=QLjg0svIPtVO4AnZEAsB9YL+BZeQ6z7PQk1TUBu/oziF2jBJSWa3aDZF5ZYE8s9PGI DAnk+kfq6YDG23PaJSTr9zS0GUEF1P+TzaYOGLst8D4KyQIsfIx4PtIuql4BU1/ZzZG8 UtH4aa18TejxGEUoFzbSuBVBhhzHZWfau8VJaPznoC8DYywJi364pQfxtf75a9feqUXi fPpDoYYt2+9jIdXzmMmoPLlBoY4UQfqJN/bZLF82dxSbrdvELtK2AX68G1rOgHK/EIcf kS7D9Q/J12bxSROb5o7s4iE6Rbqu+0hy4UU55eFWAtwVEIq3S3dXw9Cq4ZRo77+Gj9g7 aiFw== X-Gm-Message-State: AOAM533OV5WG2HTnr9EInzXpliRhXkhlHIbKC5S0T2LAbUTetWIbqUhQ CjYsAG3JVjKK9p6sDNrQ73VBz+O6aGY= X-Google-Smtp-Source: ABdhPJxTwLwO5TpmIAQ7qzBoDKdfBRwhWZsxNvZuBn7FWLdtEhe7S5jem68iKE58df8MePZ9ss8Ibg== X-Received: by 2002:a1c:35c2:0:b0:38e:c75d:90a3 with SMTP id c185-20020a1c35c2000000b0038ec75d90a3mr1439679wma.98.1652210833023; Tue, 10 May 2022 12:27:13 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m189-20020a1ca3c6000000b003942a244ed7sm61506wme.28.2022.05.10.12.27.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:11 -0700 (PDT) Message-Id: <87bdc22322b0f58bf153b963207cffe4f41c9ae9.1652210824.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 10 May 2022 19:27:01 +0000 Subject: [PATCH v4 4/7] Implement `scalar diagnose` Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Taylor Blau , Derrick Stolee , Elijah Newren , Johannes Schindelin , Johannes Schindelin Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Johannes Schindelin From: Johannes Schindelin Over the course of Scalar's development, it became obvious that there is a need for a command that can gather all kinds of useful information that can help identify the most typical problems with large worktrees/repositories. 
The `diagnose` command is the culmination of this hard-won knowledge: it gathers the installed hooks, the config, a couple of statistics describing the data shape, among other pieces of information, and then wraps everything up in a tidy, neat `.zip` archive. Note: originally, Scalar was implemented in C# using the .NET API, where we had the luxury of a comprehensive standard library that includes basic functionality such as writing a `.zip` file. In the C version, we lack such a commodity. Rather than introducing a dependency on, say, libzip, we slightly abuse Git's `archive` machinery: we write out a `.zip` of the empty tree, augmented by a couple of files that are added via the `--add-file*` options. We are careful not to modify the current repository in any way lest the very circumstances that required `scalar diagnose` to be run are changed by the `diagnose` run itself. Signed-off-by: Johannes Schindelin --- contrib/scalar/scalar.c | 144 +++++++++++++++++++++++++++++++ contrib/scalar/scalar.txt | 12 +++ contrib/scalar/t/t9099-scalar.sh | 14 +++ 3 files changed, 170 insertions(+) diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c index 00dcd4b50ef..367a2c50e25 100644 --- a/contrib/scalar/scalar.c +++ b/contrib/scalar/scalar.c @@ -11,6 +11,7 @@ #include "dir.h" #include "packfile.h" #include "help.h" +#include "archive.h" /* * Remove the deepest subdirectory in the provided path string. Path must not @@ -261,6 +262,47 @@ static int unregister_dir(void) return res; } +static int add_directory_to_archiver(struct strvec *archiver_args, + const char *path, int recurse) +{ + int at_root = !*path; + DIR *dir = opendir(at_root ? "." 
: path); + struct dirent *e; + struct strbuf buf = STRBUF_INIT; + size_t len; + int res = 0; + + if (!dir) + return error(_("could not open directory '%s'"), path); + + if (!at_root) + strbuf_addf(&buf, "%s/", path); + len = buf.len; + strvec_pushf(archiver_args, "--prefix=%s", buf.buf); + + while (!res && (e = readdir(dir))) { + if (!strcmp(".", e->d_name) || !strcmp("..", e->d_name)) + continue; + + strbuf_setlen(&buf, len); + strbuf_addstr(&buf, e->d_name); + + if (e->d_type == DT_REG) + strvec_pushf(archiver_args, "--add-file=%s", buf.buf); + else if (e->d_type != DT_DIR) + warning(_("skipping '%s', which is neither file nor " + "directory"), buf.buf); + else if (recurse && + add_directory_to_archiver(archiver_args, + buf.buf, recurse) < 0) + res = -1; + } + + closedir(dir); + strbuf_release(&buf); + return res; +} + /* printf-style interface, expects `=` argument */ static int set_config(const char *fmt, ...) { @@ -501,6 +543,107 @@ cleanup: return res; } +static int cmd_diagnose(int argc, const char **argv) +{ + struct option options[] = { + OPT_END(), + }; + const char * const usage[] = { + N_("scalar diagnose []"), + NULL + }; + struct strbuf zip_path = STRBUF_INIT; + struct strvec archiver_args = STRVEC_INIT; + char **argv_copy = NULL; + int stdout_fd = -1, archiver_fd = -1; + time_t now = time(NULL); + struct tm tm; + struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT; + int res = 0; + + argc = parse_options(argc, argv, NULL, options, + usage, 0); + + setup_enlistment_directory(argc, argv, usage, options, &zip_path); + + strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_"); + strbuf_addftime(&zip_path, + "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0); + strbuf_addstr(&zip_path, ".zip"); + switch (safe_create_leading_directories(zip_path.buf)) { + case SCLD_EXISTS: + case SCLD_OK: + break; + default: + res = error_errno(_("could not create directory for '%s'"), + zip_path.buf); + goto diagnose_cleanup; + } + stdout_fd = dup(1); + if (stdout_fd < 0) { + res = 
error_errno(_("could not duplicate stdout")); + goto diagnose_cleanup; + } + + archiver_fd = xopen(zip_path.buf, O_CREAT | O_WRONLY | O_TRUNC, 0666); + if (archiver_fd < 0 || dup2(archiver_fd, 1) < 0) { + res = error_errno(_("could not redirect output")); + goto diagnose_cleanup; + } + + init_zip_archiver(); + strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL); + + strbuf_reset(&buf); + strbuf_addstr(&buf, "Collecting diagnostic info\n\n"); + get_version_info(&buf, 1); + + strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree); + write_or_die(stdout_fd, buf.buf, buf.len); + strvec_pushf(&archiver_args, + "--add-file-with-content=diagnostics.log:%.*s", + (int)buf.len, buf.buf); + + if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) || + (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) || + (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) || + (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) || + (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0))) + goto diagnose_cleanup; + + strvec_pushl(&archiver_args, "--prefix=", + oid_to_hex(the_hash_algo->empty_tree), "--", NULL); + + /* `write_archive()` modifies the `argv` passed to it. Let it. 
*/ + argv_copy = xmemdupz(archiver_args.v, + sizeof(char *) * archiver_args.nr); + res = write_archive(archiver_args.nr, (const char **)argv_copy, NULL, + the_repository, NULL, 0); + if (res) { + error(_("failed to write archive")); + goto diagnose_cleanup; + } + + if (!res) + fprintf(stderr, "\n" + "Diagnostics complete.\n" + "All of the gathered info is captured in '%s'\n", + zip_path.buf); + +diagnose_cleanup: + if (archiver_fd >= 0) { + close(1); + dup2(stdout_fd, 1); + } + free(argv_copy); + strvec_clear(&archiver_args); + strbuf_release(&zip_path); + strbuf_release(&path); + strbuf_release(&buf); + + return res; +} + static int cmd_list(int argc, const char **argv) { if (argc != 1) @@ -802,6 +945,7 @@ static struct { { "reconfigure", cmd_reconfigure }, { "delete", cmd_delete }, { "version", cmd_version }, + { "diagnose", cmd_diagnose }, { NULL, NULL}, }; diff --git a/contrib/scalar/scalar.txt b/contrib/scalar/scalar.txt index f416d637289..22583fe046e 100644 --- a/contrib/scalar/scalar.txt +++ b/contrib/scalar/scalar.txt @@ -14,6 +14,7 @@ scalar register [] scalar unregister [] scalar run ( all | config | commit-graph | fetch | loose-objects | pack-files ) [] scalar reconfigure [ --all | ] +scalar diagnose [] scalar delete DESCRIPTION @@ -129,6 +130,17 @@ reconfigure the enlistment. With the `--all` option, all enlistments currently registered with Scalar will be reconfigured. Use this option after each Scalar upgrade. +Diagnose +~~~~~~~~ + +diagnose []:: + When reporting issues with Scalar, it is often helpful to provide the + information gathered by this command, including logs and certain + statistics describing the data shape of the current enlistment. ++ +The output of this command is a `.zip` file that is written into +a directory adjacent to the worktree in the `src` directory. 
+ Delete ~~~~~~ diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh index 9d83fdf25e8..6802d317258 100755 --- a/contrib/scalar/t/t9099-scalar.sh +++ b/contrib/scalar/t/t9099-scalar.sh @@ -90,4 +90,18 @@ test_expect_success '`scalar [...] ` errors out when dir is missing' ' grep "cloned. does not exist" err ' +SQ="'" +test_expect_success UNZIP 'scalar diagnose' ' + scalar clone "file://$(pwd)" cloned --single-branch && + scalar diagnose cloned >out 2>err && + sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path && + zip_path=$(cat zip_path) && + test -n "$zip_path" && + unzip -v "$zip_path" && + folder=${zip_path%.zip} && + test_path_is_missing "$folder" && + unzip -p "$zip_path" diagnostics.log >out && + test_file_not_empty out +' + test_done From patchwork Tue May 10 19:27:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin X-Patchwork-Id: 12845448 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7295EC433F5 for ; Tue, 10 May 2022 19:27:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238841AbiEJT1b (ORCPT ); Tue, 10 May 2022 15:27:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239805AbiEJT1S (ORCPT ); Tue, 10 May 2022 15:27:18 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3CD5C3616B for ; Tue, 10 May 2022 12:27:15 -0700 (PDT) Received: by mail-wr1-x42c.google.com with SMTP id d5so8072wrb.6 for ; Tue, 10 May 2022 12:27:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; 
s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=FJ0VvqP8x2j0FbHa0XDUiAhX7eAX+nOlgM67ZhMwLss=; b=alYqXdoz0W4CSRa3xDK2B4vwS167cMjw8Bc0zzyDi/7DkvS235OxL5D58vEalFihXI /l3lrqc1XFK+JbsyCxWBGuPImdqvnMc98xRxDuct5FHz6qjYRddItFuL79P7nE2+G25B +SEJC8DBaV9DZ73HNpUj6PG706xr7oIC0qlsnAN56OVBdX8/ZoB9f/tt7KtlcXtk4+Pn 3Pwh6u+AmZbkZL+wWEQLM8Y45DrRHxtg8Mw6BeJWUCdm8I0Br5Cb2xBbBIy9UO/mPCMj 5vvbAej+mjX+OoDr4329xsa1LR7rgH06m76nuAfOO2kW1bimAmZlhWDGDH37VtQnu3Ij BuJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=FJ0VvqP8x2j0FbHa0XDUiAhX7eAX+nOlgM67ZhMwLss=; b=Q0JV0Cm9j7HXpAjR7NCQu1GmmA05GieFA39qfPfzPfNBESCezvUQKhYky9SOOn8NNz nltHvhyLTkJB1FfCTSLY1Ik12PhOitLqCr44cWFfW91LJyc+RAqljjHcq5otcZLdLPAh j5s2ooiC7NZi3bi+oF8ICOe0soxxJTsqxchFtrtbLUaen5waW51Q8toXCIqdMkeOorCu 7Mowdgzsy4/eYy1TgXJRHf7cjlFm2kkPI/jFfDyFpfH+MuLIOeGsry6sbXQxP9SENkU1 a0CgRQNCItYMECh1M8abCmfBzHUzNEOvqePR+hIebPrF3Vk9/MzsjdmWsTaq/odaxyD9 /gtw== X-Gm-Message-State: AOAM533R8WmJATl+Fgl8aqEJh/D0pdrEtP12z1REarUvcK90WviF4Wu/ 44ObgbX4Ch7UXYzt+Qh+3rJZ4m1fyAo= X-Google-Smtp-Source: ABdhPJxE/KFvCzNoJJXyAb5HYeUqcZ9weWTw7ohQWn5XdThHKFJQ7FkRDGxGkr4nwV67eqf7qtxZJA== X-Received: by 2002:a5d:5051:0:b0:20a:e005:cca3 with SMTP id h17-20020a5d5051000000b0020ae005cca3mr19394365wrt.560.1652210834537; Tue, 10 May 2022 12:27:14 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c18-20020a05600c0a5200b003942a244ec8sm106147wmq.13.2022.05.10.12.27.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:13 -0700 (PDT) Message-Id: <3f63b197d420c35f7606d823a981b067964876a6.1652210824.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 10 May 2022 19:27:02 +0000 Subject: [PATCH v4 5/7] scalar diagnose: include disk space information Fcc: Sent 
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren, Johannes Schindelin, Johannes Schindelin
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Johannes Schindelin

From: Johannes Schindelin

When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.

Signed-off-by: Johannes Schindelin
---
 contrib/scalar/scalar.c          | 53 ++++++++++++++++++++++++++++++++
 contrib/scalar/t/t9099-scalar.sh |  1 +
 2 files changed, 54 insertions(+)

diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 367a2c50e25..34cbec59b45 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -303,6 +303,58 @@ static int add_directory_to_archiver(struct strvec *archiver_args,
 	return res;
 }

+#ifndef WIN32
+#include <sys/statvfs.h>
+#endif
+
+static int get_disk_info(struct strbuf *out)
+{
+#ifdef WIN32
+	struct strbuf buf = STRBUF_INIT;
+	char volume_name[MAX_PATH], fs_name[MAX_PATH];
+	DWORD serial_number, component_length, flags;
+	ULARGE_INTEGER avail2caller, total, avail;
+
+	strbuf_realpath(&buf, ".", 1);
+	if (!GetDiskFreeSpaceExA(buf.buf, &avail2caller, &total, &avail)) {
+		error(_("could not determine free disk size for '%s'"),
+		      buf.buf);
+		strbuf_release(&buf);
+		return -1;
+	}
+
+	strbuf_setlen(&buf, offset_1st_component(buf.buf));
+	if (!GetVolumeInformationA(buf.buf, volume_name, sizeof(volume_name),
+				   &serial_number, &component_length, &flags,
+				   fs_name, sizeof(fs_name))) {
+		error(_("could not get info for '%s'"), buf.buf);
+		strbuf_release(&buf);
+		return -1;
+	}
+	strbuf_addf(out, "Available space on '%s': ", buf.buf);
+	strbuf_humanise_bytes(out, avail2caller.QuadPart);
+	strbuf_addch(out, '\n');
+	strbuf_release(&buf);
+#else
+	struct strbuf buf = STRBUF_INIT;
+	struct statvfs stat;
+
+	strbuf_realpath(&buf, ".", 1);
+	if (statvfs(buf.buf, &stat) < 0) {
+		error_errno(_("could not determine free disk size for '%s'"),
+			    buf.buf);
+		strbuf_release(&buf);
+		return -1;
+	}
+
+	strbuf_addf(out, "Available space on '%s': ", buf.buf);
+	strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
+	strbuf_addf(out, " (mount flags 0x%lx)\n", stat.f_flag);
+	strbuf_release(&buf);
+#endif
+	return 0;
+}
+
 /* printf-style interface, expects `<key>=<value>` argument */
 static int set_config(const char *fmt, ...)
 {
@@ -599,6 +651,7 @@ static int cmd_diagnose(int argc, const char **argv)
 	get_version_info(&buf, 1);

 	strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+	get_disk_info(&buf);
 	write_or_die(stdout_fd, buf.buf, buf.len);
 	strvec_pushf(&archiver_args,
 		     "--add-file-with-content=diagnostics.log:%.*s",
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 6802d317258..934b2485d91 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -94,6 +94,7 @@ SQ="'"
 test_expect_success UNZIP 'scalar diagnose' '
 	scalar clone "file://$(pwd)" cloned --single-branch &&
 	scalar diagnose cloned >out 2>err &&
+	grep "Available space" out &&
 	sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
 	zip_path=$(cat zip_path) &&
 	test -n "$zip_path" &&

From patchwork Tue May 10 19:27:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew John Cheetham
X-Patchwork-Id: 12845447
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3BD9C433EF for ; Tue, 10 May 2022 19:27:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231430AbiEJT12 (ORCPT ); Tue, 10 May 2022 15:27:28 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35800 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
with ESMTP id S240004AbiEJT1S (ORCPT ); Tue, 10 May 2022 15:27:18 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32BA02A715 for ; Tue, 10 May 2022 12:27:17 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id d5so8145wrb.6 for ; Tue, 10 May 2022 12:27:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=T42C+4SBbpPKC+s/Z6/TtR9YCHC68Sfy/m5Tfi8ZBiA=; b=KSBvZFpWyGCXhmSR8M1QDt/De/KCZv/Y5fDjVyN1m0l+uVCz2tOdV81Mz0ItciteRP gCW7C/xvzuI93uw12KYdtnwmnTC5mw5vmq5uCzzEBJMK/TI2/RFnvocInxRZWjQ80hud 1dn/+L3cMzq4CIFk/yqSkpIm6StOqNbTVCUhrHWoemYkC8y3GE/vKL6+LbkZVPTui+ca 2vsSR8zrsr3cheM54wabMdHnD4VXXDE8eRD1C/n8rae95iLxIDN4hkbMx4BR9M0ZHx2i TCYzRg6YcZfSeb0pcuGq1hF07jqkqvl9HRqg4pbbA51+SNrBetUK15SpcSu08cv+sPX/ FdhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=T42C+4SBbpPKC+s/Z6/TtR9YCHC68Sfy/m5Tfi8ZBiA=; b=rOfzixtK21/dqY4Nwi/wtsqW19irDYn5qOuxmT5jCtvZm5kZCthrkbhYXDOEzHw7XK DdLUOsAtGoOD8t7rjLxuR2HNehlM1DL5KVMjsGoldM+hipy059Sh9kNEswlGb2pPS+X/ GuFA4IfbZK3dX/+NL9UUU7i0SbQIEeDIWpnyEqBzRCibS78yqUEByyfYquedw8Hrgfrm GVSAGkvGaHXTL+suxaM93ItRRAWbEnDOJbC46j7xvZG1KZVvPuPphchNuT//gy/zsEYK Fbz4PCYq9bXqB7zhjR9fxMGPLv1oLy+Cv5YJtw/4abwEgT1VyodMgmb/0Oj4EIseuBN5 AYIg== X-Gm-Message-State: AOAM531R7puwmti1SPV2o0QfjOAzG1SxFz6/UOmUtIX9j58PUvyO17lH z727D3p/lznEtrEX31G6zghRWfivAmE= X-Google-Smtp-Source: ABdhPJxfJYKEciCb1JB+6iquHMfJQOqFkolOMjiEG02Fwovqy/BBmEP+aNNQM7/yqbC736bP4gEh1A== X-Received: by 2002:adf:d1e3:0:b0:20c:6684:9b10 with SMTP id g3-20020adfd1e3000000b0020c66849b10mr20450507wrd.53.1652210836436; Tue, 10 May 2022 12:27:16 -0700 (PDT) Received: from [127.0.0.1] 
 ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r10-20020adfa14a000000b0020cd0762f37sm5263512wrr.107.2022.05.10.12.27.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:15 -0700 (PDT)
Message-Id:
In-Reply-To:
References:
Date: Tue, 10 May 2022 19:27:03 +0000
Subject: [PATCH v4 6/7] scalar: teach `diagnose` to gather packfile info
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren, Johannes Schindelin, Matthew John Cheetham
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Matthew John Cheetham

From: Matthew John Cheetham

It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.

While at it, also enumerate the pack files in the alternate
object directories, if any are registered.

Signed-off-by: Matthew John Cheetham
Signed-off-by: Johannes Schindelin
---
 contrib/scalar/scalar.c          | 30 ++++++++++++++++++++++++++++++
 contrib/scalar/t/t9099-scalar.sh |  6 +++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 34cbec59b45..e8e0a5ec473 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -12,6 +12,7 @@
 #include "packfile.h"
 #include "help.h"
 #include "archive.h"
+#include "object-store.h"

 /*
  * Remove the deepest subdirectory in the provided path string. Path must not
@@ -595,6 +596,29 @@ cleanup:
 	return res;
 }

+static void dir_file_stats_objects(const char *full_path, size_t full_path_len,
+				   const char *file_name, void *data)
+{
+	struct strbuf *buf = data;
+	struct stat st;
+
+	if (!stat(full_path, &st))
+		strbuf_addf(buf, "%-70s %16" PRIuMAX "\n", file_name,
+			    (uintmax_t)st.st_size);
+}
+
+static int dir_file_stats(struct object_directory *object_dir, void *data)
+{
+	struct strbuf *buf = data;
+
+	strbuf_addf(buf, "Contents of %s:\n", object_dir->path);
+
+	for_each_file_in_pack_dir(object_dir->path, dir_file_stats_objects,
+				  data);
+
+	return 0;
+}
+
 static int cmd_diagnose(int argc, const char **argv)
 {
 	struct option options[] = {
@@ -657,6 +681,12 @@ static int cmd_diagnose(int argc, const char **argv)
 		     "--add-file-with-content=diagnostics.log:%.*s",
 		     (int)buf.len, buf.buf);

+	strbuf_reset(&buf);
+	strbuf_addstr(&buf, "--add-file-with-content=packs-local.txt:");
+	dir_file_stats(the_repository->objects->odb, &buf);
+	foreach_alt_odb(dir_file_stats, &buf);
+	strvec_push(&archiver_args, buf.buf);
+
 	if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
 	    (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
 	    (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 934b2485d91..3dd5650cceb 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -93,6 +93,8 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
 SQ="'"
 test_expect_success UNZIP 'scalar diagnose' '
 	scalar clone "file://$(pwd)" cloned --single-branch &&
+	git repack &&
+	echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
 	scalar diagnose cloned >out 2>err &&
 	grep "Available space" out &&
 	sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -102,7 +104,9 @@ test_expect_success UNZIP 'scalar diagnose' '
 	folder=${zip_path%.zip} &&
 	test_path_is_missing "$folder" &&
 	unzip -p "$zip_path" diagnostics.log >out &&
-	test_file_not_empty out
+	test_file_not_empty out &&
+	unzip -p "$zip_path" packs-local.txt >out &&
+	grep "$(pwd)/.git/objects" out
 '

 test_done

From patchwork Tue May 10 19:27:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew John Cheetham
X-Patchwork-Id: 12845450
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99E68C433EF for ; Tue, 10 May 2022 19:27:42 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240905AbiEJT1i (ORCPT ); Tue, 10 May 2022 15:27:38 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232327AbiEJT1V (ORCPT ); Tue, 10 May 2022 15:27:21 -0400
Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE3AA2FFE1 for ; Tue, 10 May 2022 12:27:19 -0700 (PDT)
Received: by mail-wm1-x32a.google.com with SMTP id k126so10793700wme.2 for ; Tue, 10 May 2022 12:27:19 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc
:content-transfer-encoding:mime-version:to:cc; bh=S5kaxr+/EdXJGKOsHbTN4MuYruvA2b0L7QgcHc8PT3c=; b=IsRWGK/VpgRhHnvWhjBKWFmH5pvjQ6TqBNPfo61yLA8l27r5S8TDRQ+qthiBcdhN6I E1LDXY+O7ViTVXQwDZUwuLQ3H8TD5UCYDpj1PqywV47A94EE2lDQxEteucXrRSTGXsxx 8+ik5PJL4nnAm6ZFadivLd7p5iVkN+E5XVnopK1kQoA0GNfeMAEpHlRuSo7t3f8a1MyY T2VOUayap3viSDdivKKwKjblbZ3h+BRdce797mZ6hWLEiNHSGoBtfpvk4G4Sw6j93Pnd z9Vfl5KTKGU4xnnQz7adfIkRDwZDswEK/fo9MvtbsL0yDsfZ+HxgX0MNU70tqxGRJvae 2Nyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=S5kaxr+/EdXJGKOsHbTN4MuYruvA2b0L7QgcHc8PT3c=; b=FFbPxD/B0Yh9lk6bebSIrrmInjfcfFJx/pz+hI/cO9ITqCsvp6kMquZI9mAABQcLad 8lHDS+WAopDk8i/CljfnN9/wMBFmcnmh0UOGJsG5bweSfK8AYIAZoKhkKTXkBfgpxeQ7 htl01ulTwkU4GudivNnWSwI7xw+5LBhC8FCiRBUS0U9XAxByVgs08YNCcKWd3fD4HqK+ 9TIXZdVPKmew5VZkCOvulnGW8cDq+CvGj7GITEASENQnRkbiuRB3Prm4M84CWsrVo7iV dzeK2hlOFZ8mgFFt5d3xc5fcoyyTF6yWw1dqHdvJEszxCyU4WbzIuejUOuah/T7p7Bq6 uNRQ== X-Gm-Message-State: AOAM531dukKSW1A1ZeHkyg2vBdwncBESDgGJHMay4+ld5iM7wXKb1IUk 3DeNckto8GCzq6KWjYXMRvW4wtsRzWM= X-Google-Smtp-Source: ABdhPJyx/csIfAOm0qQBTVEn5dvHay5A8JxT7TD0Fj2DAluDnbM/3yqpiDUcyLePkeRy3ybk/ozogw== X-Received: by 2002:a7b:cb57:0:b0:393:db11:52ad with SMTP id v23-20020a7bcb57000000b00393db1152admr1397692wmj.143.1652210838140; Tue, 10 May 2022 12:27:18 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y19-20020a1c4b13000000b003945237fea1sm147532wma.0.2022.05.10.12.27.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 May 2022 12:27:17 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 10 May 2022 19:27:04 +0000 Subject: [PATCH v4 7/7] scalar: teach `diagnose` to gather loose objects information Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Taylor Blau , Derrick Stolee , Elijah Newren , Johannes 
 Schindelin , Matthew John Cheetham
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Matthew John Cheetham

From: Matthew John Cheetham

When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.

By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.

Signed-off-by: Matthew John Cheetham
Signed-off-by: Johannes Schindelin
---
 contrib/scalar/scalar.c          | 59 ++++++++++++++++++++++++++++++++
 contrib/scalar/t/t9099-scalar.sh |  5 ++-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index e8e0a5ec473..03da7452d83 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -619,6 +619,60 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
 	return 0;
 }

+static int count_files(char *path)
+{
+	DIR *dir = opendir(path);
+	struct dirent *e;
+	int count = 0;
+
+	if (!dir)
+		return 0;
+
+	while ((e = readdir(dir)) != NULL)
+		if (!is_dot_or_dotdot(e->d_name) && e->d_type == DT_REG)
+			count++;
+
+	closedir(dir);
+	return count;
+}
+
+static void loose_objs_stats(struct strbuf *buf, const char *path)
+{
+	DIR *dir = opendir(path);
+	struct dirent *e;
+	int count;
+	int total = 0;
+	unsigned char c;
+	struct strbuf count_path = STRBUF_INIT;
+	size_t base_path_len;
+
+	if (!dir)
+		return;
+
+	strbuf_addstr(buf, "Object directory stats for ");
+	strbuf_add_absolute_path(buf, path);
+	strbuf_addstr(buf, ":\n");
+
+	strbuf_add_absolute_path(&count_path, path);
+	strbuf_addch(&count_path, '/');
+	base_path_len = count_path.len;
+
+	while ((e = readdir(dir)) != NULL)
+		if (!is_dot_or_dotdot(e->d_name) &&
+		    e->d_type == DT_DIR && strlen(e->d_name) == 2 &&
+		    !hex_to_bytes(&c, e->d_name, 1)) {
+			strbuf_setlen(&count_path, base_path_len);
+			strbuf_addstr(&count_path, e->d_name);
+			total += (count = count_files(count_path.buf));
+			strbuf_addf(buf, "%s : %7d files\n", e->d_name, count);
+		}
+
+	strbuf_addf(buf, "Total: %d loose objects", total);
+
+	strbuf_release(&count_path);
+	closedir(dir);
+}
+
 static int cmd_diagnose(int argc, const char **argv)
 {
 	struct option options[] = {
@@ -687,6 +741,11 @@ static int cmd_diagnose(int argc, const char **argv)
 	foreach_alt_odb(dir_file_stats, &buf);
 	strvec_push(&archiver_args, buf.buf);

+	strbuf_reset(&buf);
+	strbuf_addstr(&buf, "--add-file-with-content=objects-local.txt:");
+	loose_objs_stats(&buf, ".git/objects");
+	strvec_push(&archiver_args, buf.buf);
+
 	if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
 	    (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
 	    (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 3dd5650cceb..72023a1ca1d 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -95,6 +95,7 @@ test_expect_success UNZIP 'scalar diagnose' '
 	scalar clone "file://$(pwd)" cloned --single-branch &&
 	git repack &&
 	echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+	test_commit -C cloned/src loose &&
 	scalar diagnose cloned >out 2>err &&
 	grep "Available space" out &&
 	sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -106,7 +107,9 @@ test_expect_success UNZIP 'scalar diagnose' '
 	unzip -p "$zip_path" diagnostics.log >out &&
 	test_file_not_empty out &&
 	unzip -p "$zip_path" packs-local.txt >out &&
-	grep "$(pwd)/.git/objects" out
+	grep "$(pwd)/.git/objects" out &&
+	unzip -p "$zip_path" objects-local.txt >out &&
+	grep "^Total: [1-9]" out
 '

 test_done
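
As a standalone illustration of the loose-object statistics gathered in patch 7/7, here is a minimal sketch; it is not the code from this series. The helper names (`is_fanout_dir_name`, `join_path`, `count_regular_files`, `count_loose_objects`) are hypothetical, and a POSIX system is assumed. Unlike the patch, the sketch classifies entries with `lstat()` instead of `d_type`, because `d_type` is not specified by POSIX and some filesystems report `DT_UNKNOWN`; that substitution is a choice made for this example only.

```c
/* Hypothetical standalone sketch; names are illustrative, not from scalar.c. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if `name` is exactly two hex digits, like a fan-out directory. */
static int is_fanout_dir_name(const char *name)
{
	assert(name != NULL);
	return strlen(name) == 2 &&
	       isxdigit((unsigned char)name[0]) &&
	       isxdigit((unsigned char)name[1]);
}

/* Join `dir` and `name` into `out`; return 0 on success, -1 if too long. */
static int join_path(char *out, size_t outlen, const char *dir, const char *name)
{
	int n = snprintf(out, outlen, "%s/%s", dir, name);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Count regular files directly inside `path`; 0 if it cannot be opened. */
static int count_regular_files(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *e;
	struct stat st;
	char full[4096];
	int count = 0;

	if (!dir)
		return 0;
	while ((e = readdir(dir)) != NULL) {
		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;
		if (join_path(full, sizeof(full), path, e->d_name) < 0)
			continue; /* over-long path: skip rather than truncate */
		if (!lstat(full, &st) && S_ISREG(st.st_mode))
			count++;
	}
	closedir(dir);
	return count;
}

/*
 * Sum regular files over all two-hex-digit subdirectories of `objdir`.
 * Returns -1 if `objdir` itself cannot be opened.
 */
static int count_loose_objects(const char *objdir)
{
	DIR *dir = opendir(objdir);
	struct dirent *e;
	struct stat st;
	char sub[4096];
	int total = 0;

	if (!dir)
		return -1;
	while ((e = readdir(dir)) != NULL) {
		if (!is_fanout_dir_name(e->d_name))
			continue;
		if (join_path(sub, sizeof(sub), objdir, e->d_name) < 0)
			continue;
		if (!lstat(sub, &st) && S_ISDIR(st.st_mode))
			total += count_regular_files(sub);
	}
	closedir(dir);
	return total;
}
```

Called as `count_loose_objects(".git/objects")` from the top of a worktree, the result should correspond to the figure the patch prints on its `Total:` line, under the assumptions stated above.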