From patchwork Tue Mar 29 00:42:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FA9FC433EF for ; Tue, 29 Mar 2022 00:42:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231653AbiC2AoU (ORCPT ); Mon, 28 Mar 2022 20:44:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231641AbiC2AoT (ORCPT ); Mon, 28 Mar 2022 20:44:19 -0400 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53EDF237FFF for ; Mon, 28 Mar 2022 17:42:37 -0700 (PDT) Received: by mail-wm1-x335.google.com with SMTP id r64so9339658wmr.4 for ; Mon, 28 Mar 2022 17:42:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=VcBruSMhra7Hu3xR7auyp+juizXtAATaYi6To93RDbY=; b=HUN/1PLCB+NTYhba6DF8/dIt3AVKPZPwuSSMxBeB1rBA3m1d3EI5XdjEaPOh6JF7RE 8JKe8xaF4M/FGV7BPH8afYFp0+hF8RLWU8UdMSm7Z1K7HG+19Q5IyCiUKD+0x1pJk6xZ o6rW0f6An+cxqgYYVD0Ic8fURHdXQmD4do8tY8h5GvE0d9h7PJTBirkai7ilr9GfpIy+ B/vecmSeL1sxlJol1+hpk/I0CskRqoK2VxAn+F8VxVhHpfcUZn7G1AoDfMWNFqvnt1Zf W+EyzLaM3xXAy1lzExIioBR0scZmqSW7QyrDwm362M4p42STmF37vo6qotqD7rHUl0iz c5dA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; 
bh=VcBruSMhra7Hu3xR7auyp+juizXtAATaYi6To93RDbY=; b=2F9M3l4rlO4fnd+ApJ3rSma+EnMOy0ugo8OOBr6gxz9Yll0XJSU6IGmSdLGt4MNuNB Tr6RQfJEOsu5kcodofnCcgxwXjaxncm5u1h+NkDBzzc3aE20fOLinPJBK8i9HbOXyqxa 7x0XH9elsS9GeFIBqEom58zQ7MCH6nVonknqpmE7tVb/ObOHc0CPhKzhHGFts+KgekT2 AJomemfoj1uFcwYaK9ukgDXPaC0vJPYSnX/Q+W7oiAFqmt0Sm1IpCcqCM+Kk3O3QIheG ZcDgWDtinkIs8TvHloE9JwUtNJJ4fKefW3goHTMI9GPGZwHVz7b07tUtDd2vgZ+mJqjd 3uMQ== X-Gm-Message-State: AOAM531bmZTzcNfiRQNW0AzKa9M2ozMvm2/gfCamNZ18SgU7GWwmigy1 4HJMur/cxv84tHpjyd+dpL6saOQTo7g= X-Google-Smtp-Source: ABdhPJxqmbKV2weoWsteyljCxCbuXyvgzKzly3XZQebC7SHOR+pSIWHrJWbIXbYDIyBlGfYMmMbK8A== X-Received: by 2002:a05:600c:3505:b0:38c:a3a8:8479 with SMTP id h5-20020a05600c350500b0038ca3a88479mr2907207wmq.4.1648514555666; Mon, 28 Mar 2022 17:42:35 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t6-20020a05600c198600b0038cafe3d47dsm768053wmq.42.2022.03.28.17.42.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:35 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:18 +0000 Subject: [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh This commit prepares for adding batch-fsync to the bulk-checkin infrastructure. The bulk-checkin infrastructure is currently used to batch up addition of large blobs to a packfile. When a blob is larger than big_file_threshold, we unconditionally add it to a pack. If bulk checkins are 'plugged', we allow multiple large blobs to be added to a single pack until we reach the packfile size limit; otherwise, we simply make a new packfile for each large blob. 
The 'unplug' call tells us when the series of blob additions is done so that we can finish the packfiles and make their objects available to subsequent operations.

Stated another way, bulk-checkin allows callers to define a transaction that adds multiple objects to the object database, where the object database can optimize its internal operations within the transaction boundary.

Batched fsync will fit into bulk-checkin by taking advantage of the plug/unplug functionality to determine the appropriate time to fsync and make newly-added objects available in the primary object database.

* Rename 'state' variable to 'bulk_checkin_state', since we will later be adding 'bulk_fsync_objdir'. This also makes the variable easier to find in the debugger, since the name is more unique.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate static variable. Doing this avoids resetting the variable in finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we seem to unintentionally disable the plugging functionality the first time a new packfile must be created due to packfile size limits. While disabling the plugging state only results in suboptimal behavior for the current code, it would be fatal for the bulk-fsync functionality later in this patch series.
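The reset problem described in the second bullet can be illustrated in isolation. The following is a minimal sketch, not the code from bulk-checkin.c: the struct names, the `nr_written` field, and the `finish_*` helpers are hypothetical stand-ins, and the only assumption carried over from the patch is that finishing a packfile zeroes the whole state struct.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical old layout: the flag lives inside the state struct. */
struct old_state {
	unsigned plugged:1;
	unsigned nr_written;
};

/* Hypothetical new layout: the flag is a separate static variable. */
static int plugged;
struct new_state {
	unsigned nr_written;
};

/* Stand-in for finishing a packfile: the whole state is zeroed for reuse. */
static void finish_old(struct old_state *s)
{
	assert(s);
	memset(s, 0, sizeof(*s));
}

static void finish_new(struct new_state *s)
{
	assert(s);
	memset(s, 0, sizeof(*s));
}

/* Flag value seen after a mid-transaction finish with the old layout. */
int old_layout_flag_after_finish(void)
{
	struct old_state s = { .plugged = 1, .nr_written = 3 };

	finish_old(&s);		/* zeroing also clears s.plugged */
	return s.plugged;
}

/* Flag value seen after a mid-transaction finish with the new layout. */
int new_layout_flag_after_finish(void)
{
	struct new_state s = { .nr_written = 3 };

	plugged = 1;
	finish_new(&s);		/* the separate flag is untouched */
	return plugged;
}
```

With the old layout the flag reads back as 0 after the finish, so every later blob would get its own packfile; with the flag held outside the struct it stays 1 until the caller unplugs.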
Signed-off-by: Neeraj Singh --- bulk-checkin.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/bulk-checkin.c b/bulk-checkin.c index 6d6c37171c9..577b135e39c 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -10,9 +10,9 @@ #include "packfile.h" #include "object-store.h" -static struct bulk_checkin_state { - unsigned plugged:1; +static int bulk_checkin_plugged; +static struct bulk_checkin_state { char *pack_tmp_name; struct hashfile *f; off_t offset; @@ -21,7 +21,7 @@ static struct bulk_checkin_state { struct pack_idx_entry **written; uint32_t alloc_written; uint32_t nr_written; -} state; +} bulk_checkin_state; static void finish_tmp_packfile(struct strbuf *basename, const char *pack_tmp_name, @@ -278,21 +278,23 @@ int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags) { - int status = deflate_to_pack(&state, oid, fd, size, type, + int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type, path, flags); - if (!state.plugged) - finish_bulk_checkin(&state); + if (!bulk_checkin_plugged) + finish_bulk_checkin(&bulk_checkin_state); return status; } void plug_bulk_checkin(void) { - state.plugged = 1; + assert(!bulk_checkin_plugged); + bulk_checkin_plugged = 1; } void unplug_bulk_checkin(void) { - state.plugged = 0; - if (state.f) - finish_bulk_checkin(&state); + assert(bulk_checkin_plugged); + bulk_checkin_plugged = 0; + if (bulk_checkin_state.f) + finish_bulk_checkin(&bulk_checkin_state); } From patchwork Tue Mar 29 00:42:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20033C433FE for ; Tue, 29 
Mar 2022 00:42:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231661AbiC2AoX (ORCPT ); Mon, 28 Mar 2022 20:44:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48920 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231651AbiC2AoU (ORCPT ); Mon, 28 Mar 2022 20:44:20 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C670238D1F for ; Mon, 28 Mar 2022 17:42:38 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id u3so22631702wrg.3 for ; Mon, 28 Mar 2022 17:42:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=TOI+vuPMSYdUksjZQAWmkRtVe6MxK9DxmcwWwfN9fG4=; b=dDEret9qhIZ1PLqnYpJ53CaWZgpA8yXOtG6HyyKYbntiGl1nhG4FieTxaI9rEKtxYM ktG7t+dZ5NEi37c5ojYTo9FnQfdk2Uv5clIFA4kv8bdhxwMr0caq6tjCwxwq8ZSn7iFl BmJKcA1lynmxWze0zniEAtjZnIQSKldQPy79C0Ti9NZGM3r8Vk/oTs/U7xO4vpBI09cQ HFkV/kzAJGye3uV5EbVWWTYLIJnu0fkPq+y+Tlz9f+AkiEF+mN4TyhDQatUEo29nF/Jb Pxzj4EKKlrfFHV+/z7qu7DLWO5FZ3JcKZD/ctZ01z8erY30gLf9Oq27F4xQwfhxXFmAE GsQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=TOI+vuPMSYdUksjZQAWmkRtVe6MxK9DxmcwWwfN9fG4=; b=xVCNdJWZiFRvzF//taK9pXJhm4ig2Y0ozIumVfT5V/1v2Pq/wNOuntszLLV3kUrwuP m93ilpDOByrMBh+vEXdDTXJPsrrqxfGzIUsE1JBIVjG/gFk6Bpv1fRafvD+TvRdP5lpg 8Q2ycCi6Dk3CKDLKOjsd702wY897eypgaklgp2uywTADNb+KMK1WvJCHmD1Q469vC61Y V5CqTTpWsbUxF/ApLIo+6GMjY1F7q5QNYWUXtHfsdD8ajWXFgzQwiMQjshJLvzEekCky kZw+1PcnrSSBDKvrR2pRptS4s6vToO2cqvNbntHGifyURigbD5aZXHUSPG7AQc6NVMur XXGA== X-Gm-Message-State: AOAM532zIK40XFDMEWW27m2hQjFswuqWccKdNIJ1oXpr7mHSCU48mGTL 
uyWXwjCAlUNeEikrNARDZyMLxgWjmDI= X-Google-Smtp-Source: ABdhPJzyd05jC/48y9Q934STxdxYYDtHNIy5TWpSZGyqgRLu8gD4JmYFMOLfJ7CP5f4ZHyO5gwBlew== X-Received: by 2002:a5d:6b0d:0:b0:1f0:6497:b071 with SMTP id v13-20020a5d6b0d000000b001f06497b071mr27673079wrw.638.1648514556624; Mon, 28 Mar 2022 17:42:36 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n22-20020a05600c4f9600b0038c6ec42c38sm738834wmq.6.2022.03.28.17.42.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:36 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:19 +0000 Subject: [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Make it clearer in the naming and documentation of the plug_bulk_checkin and unplug_bulk_checkin APIs that they can be thought of as a "transaction" to optimize operations on the object database. These transactions may be nested so that subsystems like the cache-tree writing code can optimize their operations without caring whether the top-level code has a transaction active. 
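The nesting behaviour can be sketched on its own as follows. This is an illustration under stated assumptions rather than the patch's code: `flush_count`, `write_subsystem` and `nested_flushes` are hypothetical names, the flush is reduced to a counter, and `assert` stands in for the `BUG()` call used in the actual diff.

```c
#include <assert.h>

/* Hypothetical counters standing in for the real ODB state. */
static int nesting;	/* depth of currently active transactions */
static int flush_count;	/* how many times pending work was flushed */

void begin_transaction(void)
{
	nesting += 1;
}

void end_transaction(void)
{
	nesting -= 1;
	/* An end without a matching begin is a programming error. */
	assert(nesting >= 0);

	/* Only the outermost end makes the batched objects visible. */
	if (nesting)
		return;
	flush_count += 1;
}

/* A subsystem that batches its own writes, unaware of any outer caller. */
void write_subsystem(void)
{
	begin_transaction();
	/* ... objects would be added here ... */
	end_transaction();
}

/* Number of flushes when the subsystem runs twice inside an outer transaction. */
int nested_flushes(void)
{
	flush_count = 0;
	begin_transaction();
	write_subsystem();
	write_subsystem();
	end_transaction();
	return flush_count;
}
```

Calling `write_subsystem()` by itself flushes once per call, while calling it inside an outer transaction defers everything to the single outermost `end_transaction()`. A counter makes this possible where the earlier boolean could not distinguish an inner end from the outer one.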
Signed-off-by: Neeraj Singh --- builtin/add.c | 4 ++-- bulk-checkin.c | 20 ++++++++++++-------- bulk-checkin.h | 14 ++++++++++++-- 3 files changed, 26 insertions(+), 12 deletions(-) diff --git a/builtin/add.c b/builtin/add.c index 3ffb86a4338..9bf37ceae8e 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -670,7 +670,7 @@ int cmd_add(int argc, const char **argv, const char *prefix) string_list_clear(&only_match_skip_worktree, 0); } - plug_bulk_checkin(); + begin_odb_transaction(); if (add_renormalize) exit_status |= renormalize_tracked_files(&pathspec, flags); @@ -682,7 +682,7 @@ int cmd_add(int argc, const char **argv, const char *prefix) if (chmod_arg && pathspec.nr) exit_status |= chmod_pathspec(&pathspec, chmod_arg[0], show_only); - unplug_bulk_checkin(); + end_odb_transaction(); finish: if (write_locked_index(&the_index, &lock_file, diff --git a/bulk-checkin.c b/bulk-checkin.c index 577b135e39c..8b0fd5c7723 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -10,7 +10,7 @@ #include "packfile.h" #include "object-store.h" -static int bulk_checkin_plugged; +static int odb_transaction_nesting; static struct bulk_checkin_state { char *pack_tmp_name; @@ -280,21 +280,25 @@ int index_bulk_checkin(struct object_id *oid, { int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type, path, flags); - if (!bulk_checkin_plugged) + if (!odb_transaction_nesting) finish_bulk_checkin(&bulk_checkin_state); return status; } -void plug_bulk_checkin(void) +void begin_odb_transaction(void) { - assert(!bulk_checkin_plugged); - bulk_checkin_plugged = 1; + odb_transaction_nesting += 1; } -void unplug_bulk_checkin(void) +void end_odb_transaction(void) { - assert(bulk_checkin_plugged); - bulk_checkin_plugged = 0; + odb_transaction_nesting -= 1; + if (odb_transaction_nesting < 0) + BUG("Unbalanced ODB transaction nesting"); + + if (odb_transaction_nesting) + return; + if (bulk_checkin_state.f) finish_bulk_checkin(&bulk_checkin_state); } diff --git a/bulk-checkin.h 
b/bulk-checkin.h index b26f3dc3b74..69a94422ac7 100644 --- a/bulk-checkin.h +++ b/bulk-checkin.h @@ -10,7 +10,17 @@ int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags); -void plug_bulk_checkin(void); -void unplug_bulk_checkin(void); +/* + * Tell the object database to optimize for adding + * multiple objects. end_odb_transaction must be called + * to make new objects visible. + */ +void begin_odb_transaction(void); + +/* + * Tell the object database to make any objects from the + * current transaction visible. + */ +void end_odb_transaction(void); #endif From patchwork Tue Mar 29 00:42:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794346 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 974B9C433EF for ; Tue, 29 Mar 2022 00:42:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231670AbiC2AoY (ORCPT ); Mon, 28 Mar 2022 20:44:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49048 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231656AbiC2AoV (ORCPT ); Mon, 28 Mar 2022 20:44:21 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B142237FFF for ; Mon, 28 Mar 2022 17:42:39 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id w21so18045255wra.2 for ; Mon, 28 Mar 2022 17:42:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=W9Vimte0sGyR2DMe0rSIsRe6vitxNiJv2vsesuc0ht0=; b=cd9jXdNIiJZFsZrT8zBQkwPGEoD8Ot2VIXq+u8ggdG4+5XE3g6AhCtYdO9a8Zp3RFf 2XIgegBTuQLJB/b6hza8CRQCkb1+xPZ379I4PEHGrymYjXnvRShoDY9+iUYkeRE+2k03 NSUU3xc83q7uXX+n+XI3BJ2yT8CjKHY/HJ/cKKMbHhrEI5QSgNfygS2oXW/ogpdsq/h4 Qvg63RGon184At10ad6L5M4lHHMmGULu0OIaezj/dfx+aloo/+3SJy1JkM8US89iADbT uPn7IJOhcpRNMpBeMC7NcsX4EpcFQPd97Zk0oHRS/+9orqm42z0SRpiRhNcCzW7pnHfr xaeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=W9Vimte0sGyR2DMe0rSIsRe6vitxNiJv2vsesuc0ht0=; b=dRQmYGwQYIPeFWhJH0kv4pKIJi4xUUFyKGS9WGF/7lY+8jtBUYZgWj0eLTJvVuSC+3 L/oQTqsjhXxc+syyGKxmDtXBFZke7UaO0UsuNrtEv+m2y+ohrAZbX1pGNL13oN5KE3uA 0oyKZ437Tpru9IdkRT7ryZPE2ngEF7k3Xhnk1uhqfj98fw+1y4BOaJS/ljZVoA4t8lZK se0vsDSyNrV8tg4hA498IQNudclFWPr/yVOS5hnWv1JzKki6WBKdk/a9yAuOpcoWZw02 r/UQWadttzs/5HG1gYuBrzEeH6EqCRMKny6amamLqr8JmB9uKB9SS5udG9Mi+1KfiyG0 UMWQ== X-Gm-Message-State: AOAM530qpqIWyBwBED3oepGGTgFBFjhUHdzEoGLfwoSyGL+4GHhxGJRv SLb0PQZPIhHoaDqRPUe3dibzNo1b6zo= X-Google-Smtp-Source: ABdhPJwHmuhrU2P2nmApNe7aidNS12Dm7X+VlzLrJcA9SshRb1JyLOVJqv4MIXTFG6A6F97lGCQrFA== X-Received: by 2002:adf:ed82:0:b0:205:9cf1:20fe with SMTP id c2-20020adfed82000000b002059cf120femr23663348wro.660.1648514557509; Mon, 28 Mar 2022 17:42:37 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r65-20020a1c4444000000b0038c48dd23b9sm1120847wma.5.2022.03.28.17.42.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:37 -0700 (PDT) Message-Id: <2d1bc4568ac744f11c886a5f964dbe563c04ce8b.1648514553.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:20 +0000 Subject: [PATCH v4 03/13] object-file: pass filename to fsync_or_die Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, 
ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh If we die while trying to fsync a loose object file, pass the actual filename we're trying to sync. This is likely to be more helpful for a user trying to diagnose the cause of the failure than the former 'loose object file' string. It also sidesteps any concerns about translating the die message differently for loose objects versus something else that has a real path. Signed-off-by: Neeraj Singh --- object-file.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/object-file.c b/object-file.c index b254bc50d70..5ffbf3d4fd4 100644 --- a/object-file.c +++ b/object-file.c @@ -1888,16 +1888,16 @@ void hash_object_file(const struct git_hash_algo *algo, const void *buf, } /* Finalize a file on disk, and close it. */ -static void close_loose_object(int fd) +static void close_loose_object(int fd, const char *filename) { if (the_repository->objects->odb->will_destroy) goto out; if (fsync_object_files > 0) - fsync_or_die(fd, "loose object file"); + fsync_or_die(fd, filename); else fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd, - "loose object file"); + filename); out: if (close(fd) != 0) @@ -2011,7 +2011,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr, die(_("confused by unstable object source data for %s"), oid_to_hex(oid)); - close_loose_object(fd); + close_loose_object(fd, tmp_file.buf); if (mtime) { struct utimbuf utb; From patchwork Tue Mar 29 00:42:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 1A8F0C433F5 for ; Tue, 29 Mar 2022 00:42:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231675AbiC2Ao1 (ORCPT ); Mon, 28 Mar 2022 20:44:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231645AbiC2AoW (ORCPT ); Mon, 28 Mar 2022 20:44:22 -0400 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13FE2239328 for ; Mon, 28 Mar 2022 17:42:40 -0700 (PDT) Received: by mail-wr1-x42f.google.com with SMTP id r7so21482801wrc.0 for ; Mon, 28 Mar 2022 17:42:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=QWCWFNWsBB6VcLQ3/PKqrQkuCpYYV6HLUH3YP6F+dLc=; b=i4voaN2YGFacF1PofY7ca4OgTk8iunIPd1tJ7Z7/El4wzvItD0sk1mVXGuHnjp9jX7 M6ImI/c+ALUILZL4ABgSTCTk9p7cpuwkvW9MMMjBxb+8WwnksK0WpliArk18hgXkiStW KASZkbKuJckV0Cg11RiV/s3DHzSe15Yrg9reHeEVR7bOId0+IMmICPmB9n0zUgfGP7We srHhKaFK+Tcxgl3mRkJCdO0PfdgJzkkXYI/MUB1KFgvh7vVuwtcXB15rcn0jj2PGWYOv vWzItGwczjGcNl+pVjmJ+RfXZLyntD5G1HChBX3uJtdbi0HXMufTST60cb8iimINxgwc JdJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=QWCWFNWsBB6VcLQ3/PKqrQkuCpYYV6HLUH3YP6F+dLc=; b=aP1whfzr8BTTGUm3JSWABqCZPUavl12FBYWN48hlIwmiRDsL2KbB6jvfmRTNZJOdRL iAAxdKqml1x/j/UIVCoIEn2ttp7HRbrcZg/P1sYOaNmXBYQa3NQimmrIiVURakd73WGC ainHCA2YDXjuy+vZCxr+uOOnAxI34F8v6K+zeFtVapO41vLB0DKNzfv+GFmKE3zrFiKe FA8cZ76kWBxiryK3Ar/IBgFxpmQW4lOcglzhHJxIVdpkDrxF1Xn3rnauaMy55PEXrHkf 6qVK+d8Nn8DYwxlKcNfN7iXmU0j2d8PLhXSofFsOiW8RKOjOeVvK9u1wxAU9m4S5Cng3 t0xw== X-Gm-Message-State: 
AOAM533R2120DNrSU22vnaC9/VJRuvlD6XOQlRKRkTpJ9QLi9FXPNlp1 Ec9+Nd5/FBEMiW/+W+Z13VZ85QU408k= X-Google-Smtp-Source: ABdhPJziNs+OLJu46uRsYwBM4HpMFI+HXeYX2VwtCgsIZrizt4k0EMBbkArzIQ6KzD4GDUvp2mnrfw== X-Received: by 2002:adf:f84c:0:b0:203:f4b9:4213 with SMTP id d12-20020adff84c000000b00203f4b94213mr26408480wrq.27.1648514558363; Mon, 28 Mar 2022 17:42:38 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id e8-20020a05600c2dc800b0038d05f2b34dsm1244475wmh.2.2022.03.28.17.42.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:37 -0700 (PDT) Message-Id: <9e7ae22fa4a2693fe26659f875dd780080c4cfb2.1648514553.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:21 +0000 Subject: [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh When adding many objects to a repo with `core.fsync=loose-object`, the cost of fsync'ing each object file can become prohibitive. One major source of the cost of fsync is the implied flush of the hardware writeback cache within the disk drive. This commit introduces a new `core.fsyncMethod=batch` option that batches up hardware flushes. It hooks into the bulk-checkin odb-transaction functionality, takes advantage of tmp-objdir, and uses the writeout-only support code. When the new mode is enabled, we do the following for each new object: 1a. Create the object in a tmp-objdir. 2a. Issue a pagecache writeback request and wait for it to complete. At the end of the entire transaction when unplugging bulk checkin: 1b. 
Issue an fsync against a dummy file to flush the log and hardware writeback cache, which should by now have seen the tmp-objdir writes.
2b. Rename all of the tmp-objdir files to their final names.
3b. When updating the index and/or refs, we assume that Git will issue another fsync internal to that operation. This is not the default today, but the user now has the option of syncing the index and there is a separate patch series to implement syncing of refs.

On a filesystem with a single journal that is updated during name operations (e.g. create, link, rename, etc.), such as NTFS, HFS+, or XFS, we would expect the fsync to trigger a journal writeout so that this sequence is enough to ensure that the user's data is durable by the time the git command returns. This sequence also ensures that no object files appear in the main object store unless they are fsync-durable.

Batch mode is only enabled if core.fsync includes loose-object. If the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does not include loose-object, we will use file-by-file fsyncing.

In step (1a) of the sequence, the tmp-objdir is created lazily to avoid work if no loose objects are ever added to the ODB. We use a tmp-objdir to maintain the invariant that no loose objects are visible in the main ODB unless they are properly fsync-durable. This is important since future ODB operations that try to create an object with specific contents will silently drop the new data if an object with the target hash exists without checking that the loose-object contents match the hash. Only a full git-fsck would restore the ODB to a functional state where data loss doesn't occur.

In step (1b) of the sequence, we issue an fsync against a dummy file created specifically for the purpose.
This method has a little higher cost than using one of the input object files, but makes adding new callers of this mechanism easier, since we don't need to figure out which object file is "last" or risk sharing violations by caching the fd of the last object file.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add'. Times reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh
---
 Documentation/config/core.txt |  8 ++++
 bulk-checkin.c                | 71 +++++++++++++++++++++++++++++++++++
 bulk-checkin.h                |  3 ++
 cache.h                       |  8 +++-
 config.c                      |  2 +
 object-file.c                 |  7 +++-
 6 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 9da3e5d88f6..3c90ba0b395 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -596,6 +596,14 @@ core.fsyncMethod::
 * `writeout-only` issues pagecache writeback requests, but depending on the
   filesystem and storage hardware, data added to the repository may not be
   durable in the event of a system crash. This is the default mode on macOS.
+* `batch` enables a mode that uses writeout-only flushes to stage multiple
+  updates in the disk writeback cache and then does a single full fsync of
+  a dummy file to trigger the disk cache flush at the end of the operation.
++
+  Currently `batch` mode only applies to loose-object files. Other repository
+  data is made durable as if `fsync` was specified. This mode is expected to
+  be as safe as `fsync` on macOS for repos stored on HFS+ or APFS filesystems
+  and on Windows for repos stored on NTFS or ReFS filesystems.
core.fsyncObjectFiles:: This boolean will enable 'fsync()' when writing object files. diff --git a/bulk-checkin.c b/bulk-checkin.c index 8b0fd5c7723..9799d247cad 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -3,15 +3,20 @@ */ #include "cache.h" #include "bulk-checkin.h" +#include "lockfile.h" #include "repository.h" #include "csum-file.h" #include "pack.h" #include "strbuf.h" +#include "string-list.h" +#include "tmp-objdir.h" #include "packfile.h" #include "object-store.h" static int odb_transaction_nesting; +static struct tmp_objdir *bulk_fsync_objdir; + static struct bulk_checkin_state { char *pack_tmp_name; struct hashfile *f; @@ -80,6 +85,40 @@ clear_exit: reprepare_packed_git(the_repository); } +/* + * Cleanup after batch-mode fsync_object_files. + */ +static void do_batch_fsync(void) +{ + struct strbuf temp_path = STRBUF_INIT; + struct tempfile *temp; + + if (!bulk_fsync_objdir) + return; + + /* + * Issue a full hardware flush against a temporary file to ensure + * that all objects are durable before any renames occur. The code in + * fsync_loose_object_bulk_checkin has already issued a writeout + * request, but it has not flushed any writeback cache in the storage + * hardware or any filesystem logs. This fsync call acts as a barrier + * to ensure that the data in each new object file is durable before + * the final name is visible. + */ + strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory()); + temp = xmks_tempfile(temp_path.buf); + fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp)); + delete_tempfile(&temp); + strbuf_release(&temp_path); + + /* + * Make the object files visible in the primary ODB after their data is + * fully durable. 
+ */ + tmp_objdir_migrate(bulk_fsync_objdir); + bulk_fsync_objdir = NULL; +} + static int already_written(struct bulk_checkin_state *state, struct object_id *oid) { int i; @@ -274,6 +313,36 @@ static int deflate_to_pack(struct bulk_checkin_state *state, return 0; } +void prepare_loose_object_bulk_checkin(void) +{ + /* + * We lazily create the temporary object directory + * the first time an object might be added, since + * callers may not know whether any objects will be + * added at the time they call begin_odb_transaction. + */ + if (!odb_transaction_nesting || bulk_fsync_objdir) + return; + + bulk_fsync_objdir = tmp_objdir_create("bulk-fsync"); + if (bulk_fsync_objdir) + tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0); +} + +void fsync_loose_object_bulk_checkin(int fd, const char *filename) +{ + /* + * If we have an active ODB transaction, we issue a call that + * cleans the filesystem page cache but avoids a hardware flush + * command. Later on we will issue a single hardware flush + * before renaming the objects to their final names as part of + * do_batch_fsync.
+ */ + if (!bulk_fsync_objdir || + git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) { + fsync_or_die(fd, filename); + } +} + int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags) @@ -301,4 +370,6 @@ void end_odb_transaction(void) if (bulk_checkin_state.f) finish_bulk_checkin(&bulk_checkin_state); + + do_batch_fsync(); } diff --git a/bulk-checkin.h b/bulk-checkin.h index 69a94422ac7..70edf745be8 100644 --- a/bulk-checkin.h +++ b/bulk-checkin.h @@ -6,6 +6,9 @@ #include "cache.h" +void prepare_loose_object_bulk_checkin(void); +void fsync_loose_object_bulk_checkin(int fd, const char *filename); + int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags); diff --git a/cache.h b/cache.h index ef7d34b7a09..a5bf15a5131 100644 --- a/cache.h +++ b/cache.h @@ -1040,7 +1040,8 @@ extern int use_fsync; enum fsync_method { FSYNC_METHOD_FSYNC, - FSYNC_METHOD_WRITEOUT_ONLY + FSYNC_METHOD_WRITEOUT_ONLY, + FSYNC_METHOD_BATCH, }; extern enum fsync_method fsync_method; @@ -1767,6 +1768,11 @@ void fsync_or_die(int fd, const char *); int fsync_component(enum fsync_component component, int fd); void fsync_component_or_die(enum fsync_component component, int fd, const char *msg); +static inline int batch_fsync_enabled(enum fsync_component component) +{ + return (fsync_components & component) && (fsync_method == FSYNC_METHOD_BATCH); +} + ssize_t read_in_full(int fd, void *buf, size_t count); ssize_t write_in_full(int fd, const void *buf, size_t count); ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset); diff --git a/config.c b/config.c index 3c9b6b589ab..511f4584eeb 100644 --- a/config.c +++ b/config.c @@ -1688,6 +1688,8 @@ static int git_default_core_config(const char *var, const char *value, void *cb) fsync_method = FSYNC_METHOD_FSYNC; else if (!strcmp(value, "writeout-only")) fsync_method = FSYNC_METHOD_WRITEOUT_ONLY; + else if (!strcmp(value, 
"batch")) + fsync_method = FSYNC_METHOD_BATCH; else warning(_("ignoring unknown core.fsyncMethod value '%s'"), value); diff --git a/object-file.c b/object-file.c index 5ffbf3d4fd4..d2e0c13198f 100644 --- a/object-file.c +++ b/object-file.c @@ -1893,7 +1893,9 @@ static void close_loose_object(int fd, const char *filename) if (the_repository->objects->odb->will_destroy) goto out; - if (fsync_object_files > 0) + if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) + fsync_loose_object_bulk_checkin(fd, filename); + else if (fsync_object_files > 0) fsync_or_die(fd, filename); else fsync_component_or_die(FSYNC_COMPONENT_LOOSE_OBJECT, fd, @@ -1961,6 +1963,9 @@ static int write_loose_object(const struct object_id *oid, char *hdr, static struct strbuf tmp_file = STRBUF_INIT; static struct strbuf filename = STRBUF_INIT; + if (batch_fsync_enabled(FSYNC_COMPONENT_LOOSE_OBJECT)) + prepare_loose_object_bulk_checkin(); + loose_object_path(the_repository, &filename, oid); fd = create_tmpfile(&tmp_file, filename.buf); From patchwork Tue Mar 29 00:42:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794348 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D7CAC433F5 for ; Tue, 29 Mar 2022 00:42:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231685AbiC2Ao3 (ORCPT ); Mon, 28 Mar 2022 20:44:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231651AbiC2AoX (ORCPT ); Mon, 28 Mar 2022 20:44:23 -0400 Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [IPv6:2a00:1450:4864:20::429]) by lindbergh.monkeyblade.net (Postfix) with 
ESMTPS id CEE1E23B3C6 for ; Mon, 28 Mar 2022 17:42:40 -0700 (PDT) Received: by mail-wr1-x429.google.com with SMTP id j18so22580287wrd.6 for ; Mon, 28 Mar 2022 17:42:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=0FZmb8xpqwGRKdPb/QtLOYQHy803cgaUCogAov31nBg=; b=fxEkCir+tSmvlSdbik5xz1egi7RRt1Ln6FBscYrTPShtP5S1vV03d2V+TiA9L4XFsn S87PqAwNjx+FJxKcJeVI6RFSYUgV5WA2XUHQKeIVbppZzsWae5EPWKPtyrrs8XMM6v+B 6mI27AOWCruZXYrH/H3qAEQPExm3RLRSFD/JeK/4IYioYdJMWfIYpuh3vwTIZCbHfqmu sAKW4rReqL8MLrq97cqg5vBlRBlBiWALP0uP4uxff0Zhyncl0KCcsL6J3GcRISlxqqRL hQzCQotpaM7FYKJ8fkxWbVMCHUY3POzJRPa2tA8XhwyTUNHKKEJ7tDioq6Lh+6NbxjY1 YXQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=0FZmb8xpqwGRKdPb/QtLOYQHy803cgaUCogAov31nBg=; b=6LU7GxoLauz/tWkFny3s0kg/g7WaIfq+XIxXQ39RYorxxu/QYlbWtoaKNfn5jyQesX 43W3XobB/BaqvxNHoXmcBTahgbNRZR19yJM3UWRgk2P+yup6NW1DuwwXPuReE8mBhOIN 4/kSwEk+cGLC5fHknbxxUyM7PnQrat4It/a2OU8tIcNDTmYr/ph2ElHqtyu+x8xpBAyq PcAWiFF1CFDrQykPM27PJ4MNjkX3i2WivaAWthSbAI72zg+upk8/APfdsewNj3vvc7S6 Iax5x6zrcb7S42A182ZWFk1emUh53k4OtM+cgTkJnVE3QwpX3cs/MPTdvlwTrZ8sNCtd TR0w== X-Gm-Message-State: AOAM531s2YeqE4po6RzytKoaVELpz25ZaLR71lR/InCQJ/rnh14fXuHz Vc+Tl3Hw1722z6jqwygGnHOwWDB0a48= X-Google-Smtp-Source: ABdhPJy++dWzE1oOXpZrQgWSEdGTzrfRzXPddmJYUq2xXiIDa5n409itXhI7xj3Yn3ctaIN5djnDZg== X-Received: by 2002:a5d:6304:0:b0:205:c541:3eab with SMTP id i4-20020a5d6304000000b00205c5413eabmr8122322wru.469.1648514559283; Mon, 28 Mar 2022 17:42:39 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t4-20020adfe104000000b00205b50f04f0sm7261820wrz.86.2022.03.28.17.42.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:38 -0700 
(PDT) Message-Id: <83fa4a5f3a5c79fa814932c0705867ff16a584c7.1648514553.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:22 +0000 Subject: [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Take advantage of the odb transaction infrastructure around writing the cached tree to the object database. Signed-off-by: Neeraj Singh --- cache-tree.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/cache-tree.c b/cache-tree.c index 6752f69d515..8c5e8822716 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -3,6 +3,7 @@ #include "tree.h" #include "tree-walk.h" #include "cache-tree.h" +#include "bulk-checkin.h" #include "object-store.h" #include "replace-object.h" #include "promisor-remote.h" @@ -474,8 +475,10 @@ int cache_tree_update(struct index_state *istate, int flags) trace_performance_enter(); trace2_region_enter("cache_tree", "update", the_repository); + begin_odb_transaction(); i = update_one(istate->cache_tree, istate->cache, istate->cache_nr, "", 0, &skip, flags); + end_odb_transaction(); trace2_region_leave("cache_tree", "update", the_repository); trace_performance_leave("cache_tree_update"); if (i < 0) From patchwork Tue Mar 29 00:42:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4039C433F5 for ; Tue, 29 Mar 2022 00:42:58 +0000 (UTC) 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231690AbiC2Aoj (ORCPT ); Mon, 28 Mar 2022 20:44:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231663AbiC2AoX (ORCPT ); Mon, 28 Mar 2022 20:44:23 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC66B23B3DF for ; Mon, 28 Mar 2022 17:42:41 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id u16so22608316wru.4 for ; Mon, 28 Mar 2022 17:42:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=RIF5CAZ9hTXQxe5JpOeXS1KOLlyZ8DHo3e5ICWJYFAY=; b=SwKrIF7ivZ2WCsReI+7WW3ylyxjAMLg5sa38NjXxy3NJG+mUa00pT2dkjYcKx4+zum vzfWX6rh01ktgwIdp3ZLzmZdhy3TTSG5ioJDuMZDYgd0v6UFxJsriXeat/AeYNRK5DE7 9QIkbW6PyYvK+VVD+rN3aaRjy/UXhp36a15j8P4l7MbjwcVsvRbz9HdPU4eRWMkyT3Um Xldoeejw281S/kyganuKm9QIBdPJrV63qAOaciv+fu3SfnGKnMsqZcRa9y0hrBxMoe13 1ZxUNp2T3WX4bLzWIYQrDGmcueVzp2S0RPqze5RdeJTCeaDkHLeFWgb/5vWj/1vgZSgw YOsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=RIF5CAZ9hTXQxe5JpOeXS1KOLlyZ8DHo3e5ICWJYFAY=; b=58cPv4Un2MDBsppuZnt8ZFCW+xtqjTTAlwwKOx+n38ibNvBoKY9BngPJdUA3lmuV0U m2S3f6ZyKMnbZtEAUzt26RAyiCvbM5LElFKYcVsARFNurFuxmt/zhcQ/9R2EZ8QwpU0p VPQnf7Yky528uULR+JapkTVCWBLRAy70FDMJQbApHi392r+e4IFyWS3ZiJN9EWPhIzh3 ZCsE4t+VH5bnCmOUBvxwoXKfzwECaquhcvEq/37F/ab29uZLrwyXGKFd8REWgA+gwSsk 6+JzPjXKxxGx61yX7qavkXRS5VMm2b/BBX/GeM1gjEwizIu4J/f9TqBYBr1RgesEX6W3 /FSw== X-Gm-Message-State: AOAM53137FMSq63U5r5ZqRWE5/TlHeTmXPNAkV+s034C1aDE9UIPe3EJ jDAJscQuomGJ/GPbDnv8a8KrRw+cMhA= X-Google-Smtp-Source: 
ABdhPJwlVWPg8cNK5okeAiXS6+0kR/T2P3ebs7SOeS8SsWEdC238SuV/4JVge+j7YxNSKCpp02+DIw== X-Received: by 2002:a5d:4ad2:0:b0:203:d56d:9c82 with SMTP id y18-20020a5d4ad2000000b00203d56d9c82mr27526959wrs.307.1648514560069; Mon, 28 Mar 2022 17:42:40 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w8-20020a1cf608000000b0038c8fdc93d6sm710513wmc.28.2022.03.28.17.42.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:39 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:23 +0000 Subject: [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh The update-index functionality is used internally by 'git stash push' to set up the internal stashed commit. This change enables odb-transactions for update-index infrastructure to speed up adding new objects to the object database by leveraging the batch fsync functionality. There is some risk with this change, since under batch fsync, the object files will be in a tmp-objdir until update-index is complete, so callers using the --stdin option will not see them until update-index is done. This risk is mitigated by not keeping an ODB transaction open around --stdin processing if in --verbose mode. Without --verbose mode, a caller feeding update-index via --stdin wouldn't know when update-index adds an object, even without an ODB transaction.
Signed-off-by: Neeraj Singh --- builtin/update-index.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/builtin/update-index.c b/builtin/update-index.c index aafe7eeac2a..50f9063e1c6 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -5,6 +5,7 @@ */ #define USE_THE_INDEX_COMPATIBILITY_MACROS #include "cache.h" +#include "bulk-checkin.h" #include "config.h" #include "lockfile.h" #include "quote.h" @@ -1116,6 +1117,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) */ parse_options_start(&ctx, argc, argv, prefix, options, PARSE_OPT_STOP_AT_NON_OPTION); + + /* + * Allow the object layer to optimize adding multiple objects in + * a batch. + */ + begin_odb_transaction(); while (ctx.argc) { if (parseopt_state != PARSE_OPT_DONE) parseopt_state = parse_options_step(&ctx, options, @@ -1167,6 +1174,17 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) the_index.version = preferred_index_format; } + /* + * It is possible, though unlikely, that a caller could use the verbose + * output to synchronize with addition of objects to the object + * database. The current implementation of ODB transactions leaves + * objects invisible while a transaction is active, so end the + * transaction here if verbose output is enabled. 
+ */ + + if (verbose) + end_odb_transaction(); + if (read_from_stdin) { struct strbuf buf = STRBUF_INIT; struct strbuf unquoted = STRBUF_INIT; @@ -1190,6 +1208,12 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) strbuf_release(&buf); } + /* + * By now we have added all of the new objects + */ + if (!verbose) + end_odb_transaction(); + if (split_index > 0) { if (git_config_get_split_index() == 0) warning(_("core.splitIndex is set to false; " From patchwork Tue Mar 29 00:42:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794350 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95755C433EF for ; Tue, 29 Mar 2022 00:42:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231696AbiC2Aoj (ORCPT ); Mon, 28 Mar 2022 20:44:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49262 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231666AbiC2AoY (ORCPT ); Mon, 28 Mar 2022 20:44:24 -0400 Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B780239330 for ; Mon, 28 Mar 2022 17:42:42 -0700 (PDT) Received: by mail-wr1-x42b.google.com with SMTP id d7so22599392wrb.7 for ; Mon, 28 Mar 2022 17:42:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=5ebJFCI4F0cWCTTRJ8U340O4g9iVrZ+6k6F3XALLE/c=; b=i1BleUB6ONiRQiBa1SRXsILZa7NnQ1OiufvFSSKaDn5qsGJiHuQUAF4PRs97WYujzK g+8O4MwGn+Lt1BwrpnqIi2ztteYz9AYz95raYwCey3k5DhuLkOpBrl/v/Or7SkvEfimq 
4j2MXyg/0PUNddQQiTvWd73h2RkBIVTJeDjFxLlbRMKFvDKbOVuI18oLS9PMYg5giI3L mFJpG+9pEvTcsZID7Jzg6ZzyCeRUR6wC7fmEAz/g3EfwFkKndum7CDKHDzBB0V9j1WYx geUtaP2+YJGI4foOhuPznoVoW5BW+3qcllGdb5khmiycTYD6fkNfwW9StidJ//MdNF5R FTsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=5ebJFCI4F0cWCTTRJ8U340O4g9iVrZ+6k6F3XALLE/c=; b=H0I8vZeSqELegtv6KGGtBFQG/p26l+XKDQzXzju0KQ/MBlW7iLCGUB22Xq2HyboEiy nYQpq0y0aNYHfJY3BY5v7Xaey9gDuhpyzhXp5LKTix/MXv4uZ8XG4XPWBEx89uePia/P XG+lm3jLyLxG1fuxZQCIa3R83Kj8TYSkQch+4N8qjmxR7itB/01Jsbbv3Lbpw5p9HaFk U8/+9164QquEsPcqu9dox73XSx3eb7QNq8zqxIfBiUc3lAhdYcEbZMdguatz5UitbrDh YjnE4o30Bl8SioXvai/74jUttSs9OED9SwZfCib45VMSWamR3xFAZtO2cfMIf8kF/Hxl 58zg== X-Gm-Message-State: AOAM533bBYdenn5DhjDLeWsSnS4iTyc8UnMgq7m3BvH/ubiX9XzGib8Y 4A/JNBgjKS4O4LTw3ny1Ib13HAhhbVQ= X-Google-Smtp-Source: ABdhPJy4uO/bnGMdt6HVGBhpW29aTK2w3BwmCuhGVYUJu3bObJXvy+zxpL5MRDA8dQXRyHt4QMBdCA== X-Received: by 2002:adf:e241:0:b0:203:f56e:51e3 with SMTP id bl1-20020adfe241000000b00203f56e51e3mr27592548wrb.473.1648514560873; Mon, 28 Mar 2022 17:42:40 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 14-20020adf828e000000b00205b0fc825csm7955880wrc.65.2022.03.28.17.42.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:40 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:24 +0000 Subject: [PATCH v4 07/13] unpack-objects: use the bulk-checkin infrastructure Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. 
Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh The unpack-objects functionality is used by fetch, push, and fast-import to turn the transferred data into object database entries when there are fewer objects than the 'unpacklimit' setting. By enabling an odb-transaction when unpacking objects, we can take advantage of batched fsyncs. Here are some performance numbers to justify batch mode for unpack-objects, collected on a WSL2 Ubuntu VM.

Fsync Mode | Time for 90 objects (ms)
-------------------------------------
       Off | 170
  On,fsync | 760
  On,batch | 230

Note that the default unpackLimit is 100 objects, so there's a 3x benefit in the worst case. The non-batch mode fsync scales linearly with the number of objects, so there are significant benefits even with smaller numbers of objects. Signed-off-by: Neeraj Singh --- builtin/unpack-objects.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index dbeb0680a58..56d05e2725d 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -1,5 +1,6 @@ #include "builtin.h" #include "cache.h" +#include "bulk-checkin.h" #include "config.h" #include "object-store.h" #include "object.h" @@ -503,10 +504,12 @@ static void unpack_all(void) if (!quiet) progress = start_progress(_("Unpacking objects"), nr_objects); CALLOC_ARRAY(obj_list, nr_objects); + begin_odb_transaction(); for (i = 0; i < nr_objects; i++) { unpack_one(i); display_progress(progress, i + 1); } + end_odb_transaction(); stop_progress(&progress); if (delta_list) From patchwork Tue Mar 29 00:42:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org
[23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE997C433EF for ; Tue, 29 Mar 2022 00:43:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231693AbiC2Aol (ORCPT ); Mon, 28 Mar 2022 20:44:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231674AbiC2Ao0 (ORCPT ); Mon, 28 Mar 2022 20:44:26 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 676A5239328 for ; Mon, 28 Mar 2022 17:42:43 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id i131-20020a1c3b89000000b0038ce25c870dso443387wma.1 for ; Mon, 28 Mar 2022 17:42:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=hg8plsugjefXxhMhSvzABM0c1aAu2C8TxytBC4+PwWI=; b=juBjBpLO1JtzKl86rik87aq2iGyOFk6fj4imeYiODfjvvqlqzULkMyk4qq+pTX8d9L 06YmckYxGXrOeIChq7OJsOFGeGXMmmfDY0kHUVmOYJfO+WAf5La2h5nGd9eVr91oLL3D yUiBc436v8Dp5+RuzC32teb/CIDFIca2kxwd6eHr3gSZC1I2SnZPLm5MUUtA6ihx6Nk6 4ctxDw72BSzBRIER+8t1n5+72Y2wkq0PSaKw3r803tutCJKQ2hh8nDsVH8UMm+dcNn7D RKtHKmw9+4lM3/wHuRr/J3a4aEcN2L1A0irgZyWsqGlrhIraV6Ru2fhynkSoEzMqUSco pF5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=hg8plsugjefXxhMhSvzABM0c1aAu2C8TxytBC4+PwWI=; b=x0tR4y1I4ikKkUj3VmTbyvFmnWaaXKmW7eVKfdsfj03JKhaTJHwOfktLa1ykUK3zpR 5oenZLyfKElMJ+EE6aGXT3fYJn1DoXuBVJqvQXFi4hPQ3z4r1M17cE3n07Hg2aoIV9Jt olM4atzz3RIfu4/iirGD26KYP+jTqImn6XPz3Zig4m1u5pLZ8rGcnSsMxbD2AElf6SFP iboza3R49EWaY6rWRH+4Oi36IEQIGqlYcopsUxMIShk8bxya0n4uF2mX/04gxTAKIa2v 
GyJTj0CAeipbUfvvgiVL1pVYlDfeOAxraHTtfaIsY0rXCJvepRvIlvoa+sRokFbhX30i ANKw== X-Gm-Message-State: AOAM532d8Sbe2x/vB76nEAh6pzetVdlp0anUZ13QE/SH6Op/pLiEjyYz g54/tv3ZcDN0XdJCtHSgwneSOySpUOA= X-Google-Smtp-Source: ABdhPJxymHUknA7NTNT+g9QXt9tLwXcx7mPNjqLFRcWHzpEWFUsI2HZ5tXIt3gEXAy+X2iMRUyZarQ== X-Received: by 2002:a7b:c30d:0:b0:381:4bb9:eede with SMTP id k13-20020a7bc30d000000b003814bb9eedemr2741631wmj.74.1648514561711; Mon, 28 Mar 2022 17:42:41 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o5-20020a5d4a85000000b00205a8bb9c0dsm12811176wrq.90.2022.03.28.17.42.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:41 -0700 (PDT) Message-Id: <73e54f94c204759b0cf77e7b75501adb43b14994.1648514553.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:25 +0000 Subject: [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Git for Windows has defaulted to core.fsyncObjectFiles=true since September 2017. We turn on syncing of loose object files with batch mode in upstream Git so that we can get broad coverage of the new code upstream. We don't actually do fsyncs in most of the test suite, since GIT_TEST_FSYNC is set to 0. However, we do exercise all of the surrounding batch mode code, since GIT_TEST_FSYNC=0 merely makes the maybe_fsync wrapper always appear to succeed.
Signed-off-by: Neeraj Singh --- cache.h | 4 ++++ compat/mingw.h | 3 +++ config.c | 2 +- git-compat-util.h | 2 ++ 4 files changed, 10 insertions(+), 1 deletion(-) diff --git a/cache.h b/cache.h index a5bf15a5131..7f6cbb254b4 100644 --- a/cache.h +++ b/cache.h @@ -1031,6 +1031,10 @@ enum fsync_component { FSYNC_COMPONENT_INDEX | \ FSYNC_COMPONENT_REFERENCE) +#ifndef FSYNC_COMPONENTS_PLATFORM_DEFAULT +#define FSYNC_COMPONENTS_PLATFORM_DEFAULT FSYNC_COMPONENTS_DEFAULT +#endif + /* * A bitmask indicating which components of the repo should be fsynced. */ diff --git a/compat/mingw.h b/compat/mingw.h index 6074a3d3ced..afe30868c04 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -332,6 +332,9 @@ int mingw_getpagesize(void); int win32_fsync_no_flush(int fd); #define fsync_no_flush win32_fsync_no_flush +#define FSYNC_COMPONENTS_PLATFORM_DEFAULT (FSYNC_COMPONENTS_DEFAULT | FSYNC_COMPONENT_LOOSE_OBJECT) +#define FSYNC_METHOD_DEFAULT (FSYNC_METHOD_BATCH) + struct rlimit { unsigned int rlim_cur; }; diff --git a/config.c b/config.c index 511f4584eeb..e9cac5f4707 100644 --- a/config.c +++ b/config.c @@ -1342,7 +1342,7 @@ static const struct fsync_component_name { static enum fsync_component parse_fsync_components(const char *var, const char *string) { - enum fsync_component current = FSYNC_COMPONENTS_DEFAULT; + enum fsync_component current = FSYNC_COMPONENTS_PLATFORM_DEFAULT; enum fsync_component positive = 0, negative = 0; while (string) { diff --git a/git-compat-util.h b/git-compat-util.h index 0892e209a2f..fffe42ce7c1 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1257,11 +1257,13 @@ __attribute__((format (printf, 3, 4))) NORETURN void BUG_fl(const char *file, int line, const char *fmt, ...); #define BUG(...) 
BUG_fl(__FILE__, __LINE__, __VA_ARGS__) +#ifndef FSYNC_METHOD_DEFAULT #ifdef __APPLE__ #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_WRITEOUT_ONLY #else #define FSYNC_METHOD_DEFAULT FSYNC_METHOD_FSYNC #endif +#endif enum fsync_action { FSYNC_WRITEOUT_ONLY, From patchwork Tue Mar 29 00:42:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12794356 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4ED6C433F5 for ; Tue, 29 Mar 2022 00:43:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231556AbiC2Aos (ORCPT ); Mon, 28 Mar 2022 20:44:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49516 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231645AbiC2Ao1 (ORCPT ); Mon, 28 Mar 2022 20:44:27 -0400 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32EA123C0EB for ; Mon, 28 Mar 2022 17:42:44 -0700 (PDT) Received: by mail-wm1-x332.google.com with SMTP id n63-20020a1c2742000000b0038d0c31db6eso524160wmn.1 for ; Mon, 28 Mar 2022 17:42:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=f/ZbN5quD6ggV2+4j9lRbjM6zktg3vb1d35PK4/3tT4=; b=oGm18Daxm6x6EO8YYuPMK2SfWW2iaVmTCSHm1TF+j1k3zWm137E/WQZIcwgivR53AY Ne3ifpYVYjuXPTSK+cedawzM/Bg4/3EoSut/ZrnZBSfHqnm5Px49qhPKXzBOb/haG5aM K6TwQnO+xR8T/wcYXcbJU0LcAbFl8vs76I5SMq2L2tj3FjWpqkzKzjndyiBOW0sU1+ig cLu7Nbz+b0tDK1CyXZf+rURtRT8QoXxclTeWpwak+1HtcpTbz2O90jZkJUdHRZaRtmok 
4FxSru9x7+K0TFczvdSwhReM20p1SDc28FZg70cO4/9tj2bRKcO4/RdVj5LAFJtB9ezx BndQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=f/ZbN5quD6ggV2+4j9lRbjM6zktg3vb1d35PK4/3tT4=; b=S6zQ4tyjvhwGmuWYyuBKe7ImYQxVm0MJKzg2VgE0wBblQh0QghzailLnlksMPTJ8Gv f9XN/pMCVQsIjcC9k4r8BvA0NtrY6sXRgmy3DK8mNW9PSiU/4XRHQcDqKXF+QAiA5Sae O21/CSUOPatsWa8UUWme6uqfi5xES9Ae96lnTxOmwCx0sU7kE3rugC2kmGtYoW+w4zaa lhJ9urs9xq/lvLMZcmhWXhwqpkRYg10QoOqVlJV8Zf5iI117Y3Ub6kH4FjlNpyTJkXin I62NNW1PlGKwrtJKekDKXwqAELjGRsq8MyV3JWpN0lkr2IVZMRYTErNHI7AL8wuFVamF WLNA== X-Gm-Message-State: AOAM533yL16Z30rQ0Oyd9tJZh/ScBaZ52rAgYZPUk0AypiTIaF0zHrdF RxZbnGpxRGEurGSNpXuFUMK1/YZCJcU= X-Google-Smtp-Source: ABdhPJx0kZx91wqVdy/GypWQ1FR7kRT7azjCpT0JWGpJhu4eXmre4Vh1RruJpS7YePEEwiOsvJrcMg== X-Received: by 2002:a7b:c14c:0:b0:381:32fb:a128 with SMTP id z12-20020a7bc14c000000b0038132fba128mr2838163wmi.116.1648514562526; Mon, 28 Mar 2022 17:42:42 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id v1-20020adf9e41000000b00205c3d212easm3197550wre.51.2022.03.28.17.42.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 17:42:42 -0700 (PDT) Message-Id: <124450c86d9f703dde0b5c4fa32e0bd08d4df009.1648514553.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 29 Mar 2022 00:42:26 +0000 Subject: [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Several tests use awk to parse OIDs from the output of 'git ls-files --stage' and 'git ls-tree'. 
Introduce helpers to centralize these uses of awk. Update t5317-pack-objects-filter-objects.sh to use the new ls-files helper so that it has some usages to review. Other updates are left for the future. Signed-off-by: Neeraj Singh --- t/t5317-pack-objects-filter-objects.sh | 91 +++++++++++++------------- t/test-lib-functions.sh | 10 +++ 2 files changed, 54 insertions(+), 47 deletions(-) diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 33b740ce628..bb633c9b099 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -10,9 +10,6 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME # Test blob:none filter. test_expect_success 'setup r1' ' - echo "{print \$1}" >print_1.awk && - echo "{print \$2}" >print_2.awk && - git init r1 && for n in 1 2 3 4 5 do @@ -22,10 +19,13 @@ test_expect_success 'setup r1' ' done ' +parse_verify_pack_blob_oid () { + awk '{print $1}' - +} + test_expect_success 'verify blob count in normal packfile' ' - git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \ - >ls_files_result && - awk -f print_2.awk ls_files_result | + git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 | + test_parse_ls_files_stage_oids | sort >expected && git -C r1 pack-objects --revs --stdout >all.pack <<-EOF && @@ -35,7 +35,7 @@ test_expect_success 'verify blob count in normal packfile' ' git -C r1 verify-pack -v ../all.pack >verify_result && grep blob verify_result | - awk -f print_1.awk | + parse_verify_pack_blob_oid | sort >observed && test_cmp expected observed @@ -54,12 +54,12 @@ test_expect_success 'verify blob:none packfile has no blobs' ' test_expect_success 'verify normal and blob:none packfiles have same commits/trees' ' git -C r1 verify-pack -v ../all.pack >verify_result && grep -E "commit|tree" verify_result | - awk -f print_1.awk | + parse_verify_pack_blob_oid | sort >expected && git -C r1 verify-pack -v ../filter.pack >verify_result && grep -E "commit|tree" 
verify_result | - awk -f print_1.awk | + parse_verify_pack_blob_oid | sort >observed && test_cmp expected observed @@ -123,8 +123,8 @@ test_expect_success 'setup r2' ' ' test_expect_success 'verify blob count in normal packfile' ' - git -C r2 ls-files -s large.1000 large.10000 >ls_files_result && - awk -f print_2.awk ls_files_result | + git -C r2 ls-files -s large.1000 large.10000 | + test_parse_ls_files_stage_oids | sort >expected && git -C r2 pack-objects --revs --stdout >all.pack <<-EOF && @@ -134,7 +134,7 @@ test_expect_success 'verify blob count in normal packfile' ' git -C r2 verify-pack -v ../all.pack >verify_result && grep blob verify_result | - awk -f print_1.awk | + parse_verify_pack_blob_oid | sort >observed && test_cmp expected observed @@ -161,8 +161,8 @@ test_expect_success 'verify blob:limit=1000' ' ' test_expect_success 'verify blob:limit=1001' ' - git -C r2 ls-files -s large.1000 >ls_files_result && - awk -f print_2.awk ls_files_result | + git -C r2 ls-files -s large.1000 | + test_parse_ls_files_stage_oids | sort >expected && git -C r2 pack-objects --revs --stdout --filter=blob:limit=1001 >filter.pack <<-EOF && @@ -172,15 +172,15 @@ test_expect_success 'verify blob:limit=1001' ' git -C r2 verify-pack -v ../filter.pack >verify_result && grep blob verify_result | - awk -f print_1.awk | + parse_verify_pack_blob_oid | sort >observed && test_cmp expected observed ' test_expect_success 'verify blob:limit=10001' ' - git -C r2 ls-files -s large.1000 large.10000 >ls_files_result && - awk -f print_2.awk ls_files_result | + git -C r2 ls-files -s large.1000 large.10000 | + test_parse_ls_files_stage_oids | sort >expected && git -C r2 pack-objects --revs --stdout --filter=blob:limit=10001 >filter.pack <<-EOF && @@ -190,15 +190,15 @@ test_expect_success 'verify blob:limit=10001' ' git -C r2 verify-pack -v ../filter.pack >verify_result && grep blob verify_result | - awk -f print_1.awk | + parse_verify_pack_blob_oid | sort >observed && test_cmp expected observed ' 
 test_expect_success 'verify blob:limit=1k' '
-	git -C r2 ls-files -s large.1000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1k >filter.pack <<-EOF &&
@@ -208,15 +208,15 @@ test_expect_success 'verify blob:limit=1k' '

 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
 '

 test_expect_success 'verify explicitly specifying oversized blob in input' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	echo HEAD >objects &&
@@ -226,15 +226,15 @@ test_expect_success 'verify explicitly specifying oversized blob in input' '

 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
 '

 test_expect_success 'verify blob:limit=1m' '
-	git -C r2 ls-files -s large.1000 large.10000 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r2 ls-files -s large.1000 large.10000 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	git -C r2 pack-objects --revs --stdout --filter=blob:limit=1m >filter.pack <<-EOF &&
@@ -244,7 +244,7 @@ test_expect_success 'verify blob:limit=1m' '

 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
@@ -253,12 +253,12 @@ test_expect_success 'verify blob:limit=1m' '
 test_expect_success 'verify normal and blob:limit packfiles have same commits/trees' '
 	git -C r2 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >expected &&

 	git -C r2 verify-pack -v ../filter.pack >verify_result &&
 	grep -E "commit|tree" verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
@@ -289,9 +289,8 @@ test_expect_success 'setup r3' '
 '

 test_expect_success 'verify blob count in normal packfile' '
-	git -C r3 ls-files -s sparse1 sparse2 dir1/sparse1 dir1/sparse2 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r3 ls-files -s sparse1 sparse2 dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	git -C r3 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -301,7 +300,7 @@ test_expect_success 'verify blob count in normal packfile' '

 	git -C r3 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
@@ -342,9 +341,8 @@ test_expect_success 'setup r4' '
 '

 test_expect_success 'verify blob count in normal packfile' '
-	git -C r4 ls-files -s pattern sparse1 sparse2 dir1/sparse1 dir1/sparse2 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s pattern sparse1 sparse2 dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	git -C r4 pack-objects --revs --stdout >all.pack <<-EOF &&
@@ -354,19 +352,19 @@ test_expect_success 'verify blob count in normal packfile' '

 	git -C r4 verify-pack -v ../all.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
 '

 test_expect_success 'verify sparse:oid=OID' '
-	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	git -C r4 ls-files -s pattern >staged &&
-	oid=$(awk -f print_2.awk staged) &&
+	oid=$(test_parse_ls_files_stage_oids <staged) &&
 	git -C r4 pack-objects --revs --stdout --filter=sparse:oid=$oid >filter.pack <<-EOF &&
 	HEAD
 	EOF
@@ -374,15 +372,15 @@ test_expect_success 'verify sparse:oid=OID' '

 	git -C r4 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
 '

 test_expect_success 'verify sparse:oid=oid-ish' '
-	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 >ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r4 ls-files -s dir1/sparse1 dir1/sparse2 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	git -C r4 pack-objects --revs --stdout --filter=sparse:oid=main:pattern >filter.pack <<-EOF &&
@@ -392,7 +390,7 @@ test_expect_success 'verify sparse:oid=oid-ish' '

 	git -C r4 verify-pack -v ../filter.pack >verify_result &&
 	grep blob verify_result |
-	awk -f print_1.awk |
+	parse_verify_pack_blob_oid |
 	sort >observed &&

 	test_cmp expected observed
@@ -402,9 +400,8 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 # This models previously omitted objects that we did not receive.

 test_expect_success 'setup r1 - delete loose blobs' '
-	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \
-		>ls_files_result &&
-	awk -f print_2.awk ls_files_result |
+	git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 |
+	test_parse_ls_files_stage_oids |
 	sort >expected &&

 	for id in `cat expected | sed "s|..|&/|"`
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index a027f0c409e..e6011409e2f 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1782,6 +1782,16 @@ test_oid_to_path () {
 	echo "${1%$basename}/$basename"
 }

+# Parse oids from git ls-files --staged output
+test_parse_ls_files_stage_oids () {
+	awk '{print $2}' -
+}
+
+# Parse oids from git ls-tree output
+test_parse_ls_tree_oids () {
+	awk '{print $3}' -
+}
+
 # Choose a port number based on the test script's number and store it in
 # the given variable name, unless that variable already contains a number.
 test_set_port () {

From patchwork Tue Mar 29 00:42:27 2022
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12794353
Message-Id: <282fbdef792b7157804a4139dbc61106f31bbaef.1648514553.git.gitgitgadget@gmail.com>
Date: Tue, 29 Mar 2022 00:42:27 +0000
Subject: [PATCH v4 10/13] core.fsyncmethod: tests for batch mode
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper, lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This keeps a test from skipping the validation of a newly added
object merely because that object was already in the repo.

Signed-off-by: Neeraj Singh
---
 t/lib-unique-files.sh  | 34 ++++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
 t/t3903-stash.sh       | 20 ++++++++++++++++++++
 t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
 4 files changed, 109 insertions(+), 14 deletions(-)
 create mode 100644 t/lib-unique-files.sh

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..34c01a65256
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,34 @@
+# Helper to create files with unique contents
+
+# Create multiple files with unique contents within this test run. Takes the
+# number of directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#                                        each in my_dir, all with contents
+#                                        different from previous invocations
+#                                        of this command in this run.
+
+test_create_unique_files () {
+	test "$#" -ne 3 && BUG "3 param"
+
+	local dirs="$1" &&
+	local files="$2" &&
+	local basedir="$3" &&
+	local counter=0 &&
+	local i &&
+	local j &&
+	test_tick &&
+	local basedata=$basedir$test_tick &&
+	rm -rf "$basedir" &&
+	for i in $(test_seq $dirs)
+	do
+		local dir=$basedir/dir$i &&
+		mkdir -p "$dir" &&
+		for j in $(test_seq $files)
+		do
+			counter=$((counter + 1)) &&
+			echo "$basedata.$counter">"$dir/file$j.txt"
+		done
+	done
+}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index b1f90ba3250..8979c8a5f03 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -8,6 +8,8 @@ test_description='Test of git add, including the -- option.'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh

+. $TEST_DIRECTORY/lib-unique-files.sh
+
 # Test the file mode "$1" of the file "$2" in the index.
 test_mode_in_index () {
 	case "$(git ls-files -s "$2")" in
@@ -34,6 +36,32 @@ test_expect_success \
 	'Test that "git add -- -q" works' \
 	'touch -- -q && git add -- -q'

+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'git add: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir1 &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION add -- ./files_base_dir1/ &&
+	git ls-files --stage files_base_dir1/ |
+	test_parse_ls_files_stage_oids >added_files_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files_oids >added_files_actual &&
+	test_cmp added_files_oids added_files_actual
+"
+
+test_expect_success 'git update-index: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir2 &&
+	find files_base_dir2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
+	git ls-files --stage files_base_dir2 |
+	test_parse_ls_files_stage_oids >added_files2_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files2_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files2_oids >added_files2_actual &&
+	test_cmp added_files2_oids added_files2_actual
+"
+
 test_expect_success \
 	'git add: Test that executable bit is not used if core.filemode=0' \
 	'git config core.filemode 0 &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 4abbc8fccae..20e94881964 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

 . ./test-lib.sh
+. $TEST_DIRECTORY/lib-unique-files.sh

 test_expect_success 'usage on cmd and subcommand invalid option' '
 	test_expect_code 129 git stash --invalid-option 2>usage &&
@@ -1410,6 +1411,25 @@ test_expect_success 'stash handles skip-worktree entries nicely' '
 	git rev-parse --verify refs/stash:A.t
 '

+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'stash with core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION stash push -u -- ./files_base_dir/ &&
+
+	# The files were untracked, so use the third parent,
+	# which contains the untracked files
+	git ls-tree -r stash^3 -- ./files_base_dir/ |
+	test_parse_ls_tree_oids >stashed_files_oids &&
+
+	# We created 2 dirs with 4 files each (8 files total) above
+	test_line_count = 8 stashed_files_oids &&
+	git cat-file --batch-check='%(objectname)' <stashed_files_oids >stashed_files_actual &&
+	test_cmp stashed_files_oids stashed_files_actual
+"
+
+
 test_expect_success 'git stash succeeds despite directory/file change' '
 	test_create_repo directory_file_switch_v1 &&
 	(
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index a11d61206ad..f8a0f309e2d 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -161,22 +161,27 @@ test_expect_success 'pack-objects with bogus arguments' '
 '

 check_unpack () {
+	local packname="$1" &&
+	local object_list="$2" &&
+	local git_config="$3" &&
 	test_when_finished "rm -rf git2" &&
-	git init --bare git2 &&
-	git -C git2 unpack-objects -n <"$1".pack &&
-	git -C git2 unpack-objects <"$1".pack &&
-	(cd .git && find objects -type f -print) |
-	while read path
-	do
-		cmp git2/$path .git/$path || {
-			echo $path differs.
-			return 1
-		}
-	done
+	git $git_config init --bare git2 &&
+	(
+		git $git_config -C git2 unpack-objects -n <"$packname".pack &&
+		git $git_config -C git2 unpack-objects <"$packname".pack &&
+		git $git_config -C git2 cat-file --batch-check="%(objectname)"
+	) <"$object_list" >current &&
+	cmp "$object_list" current
 }

 test_expect_success 'unpack without delta' '
-	check_unpack test-1-${packname_1}
+	check_unpack test-1-${packname_1} obj-list
+'
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
+	check_unpack test-1-${packname_1} obj-list "$BATCH_CONFIGURATION"
 '

 test_expect_success 'pack with REF_DELTA' '
@@ -185,7 +190,11 @@ test_expect_success 'pack with REF_DELTA' '
 '

 test_expect_success 'unpack with REF_DELTA' '
-	check_unpack test-2-${packname_2}
+	check_unpack test-2-${packname_2} obj-list
+'
+
+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
+	check_unpack test-2-${packname_2} obj-list "$BATCH_CONFIGURATION"
 '

 test_expect_success 'pack with OFS_DELTA' '
@@ -195,7 +204,11 @@ test_expect_success 'pack with OFS_DELTA' '
 '

 test_expect_success 'unpack with OFS_DELTA' '
-	check_unpack test-3-${packname_3}
+	check_unpack test-3-${packname_3} obj-list
+'
+
+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
+	check_unpack test-3-${packname_3} obj-list "$BATCH_CONFIGURATION"
 '

 test_expect_success 'compare delta flavors' '

From patchwork Tue Mar 29 00:42:28 2022
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12794355
Date: Tue, 29 Mar 2022 00:42:28 +0000
Subject: [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

Tests that affect the repo in stateful ways are easier to write if we
can run setup steps outside of the measured portion of a perf iteration.

This change adds a "--setup 'setup-script'" parameter to test_perf. To
make invocations easier to understand, I also moved the prerequisites to
a new --prereq parameter.

The setup facility will be used in the upcoming perf tests for batch
mode, but it already helps in some existing tests, like p5302 and p7519.
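The option handling described above can be sketched as a stand-alone
script. This is only an illustration under stated assumptions: the
function name parse_perf_args and the perf_* variables are hypothetical
and do not exist in perf-lib.sh; only the "title, options, script"
calling convention and the --prereq/--setup option names come from this
patch.

```shell
#!/bin/sh
# Hypothetical stand-alone sketch of the option loop; parse_perf_args and
# the perf_* variables are illustrative names, not part of perf-lib.sh.
parse_perf_args () {
	perf_title=$1; shift
	perf_prereq=
	perf_setup=
	while test $# != 0
	do
		case $1 in
		--prereq)
			perf_prereq=$2
			shift
			;;
		--setup)
			perf_setup=$2
			shift
			;;
		*)
			# First non-option argument: stop parsing options.
			break
			;;
		esac
		shift
	done
	# Exactly one positional argument, the measured script, must remain.
	test $# = 1 || return 1
	perf_body=$1
}

parse_perf_args 'index-pack 0 threads' --prereq PERF_EXTRA \
	--setup 'rm -rf repo.git && git init --bare repo.git' \
	'git index-pack --stdin <$PACK'
echo "title:  $perf_title"
echo "prereq: $perf_prereq"
echo "setup:  $perf_setup"
echo "body:   $perf_body"
```

Because the title comes first and the measured script comes last, the
options can appear in any order between them, which is what lets callers
drop the old positional prerequisite argument.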
Signed-off-by: Neeraj Singh
---
 t/perf/p4220-log-grep-engines.sh       |  3 +-
 t/perf/p4221-log-grep-engines-fixed.sh |  3 +-
 t/perf/p5302-pack-index.sh             | 15 +++----
 t/perf/p7519-fsmonitor.sh              | 18 ++------
 t/perf/p7820-grep-engines.sh           |  6 ++-
 t/perf/perf-lib.sh                     | 62 +++++++++++++++++++++++---
 6 files changed, 73 insertions(+), 34 deletions(-)

diff --git a/t/perf/p4220-log-grep-engines.sh b/t/perf/p4220-log-grep-engines.sh
index 2bc47ded4d1..03fbfbb85d3 100755
--- a/t/perf/p4220-log-grep-engines.sh
+++ b/t/perf/p4220-log-grep-engines.sh
@@ -36,7 +36,8 @@ do
 		else
 			prereq=""
 		fi
-		test_perf $prereq "$engine log$GIT_PERF_4220_LOG_OPTS --grep='$pattern'" "
+		test_perf "$engine log$GIT_PERF_4220_LOG_OPTS --grep='$pattern'" \
+			--prereq "$prereq" "
 			git -c grep.patternType=$engine log --pretty=format:%h$GIT_PERF_4220_LOG_OPTS --grep='$pattern' >'out.$engine' || :
 		"
 	done
diff --git a/t/perf/p4221-log-grep-engines-fixed.sh b/t/perf/p4221-log-grep-engines-fixed.sh
index 060971265a9..0a6d6dfc219 100755
--- a/t/perf/p4221-log-grep-engines-fixed.sh
+++ b/t/perf/p4221-log-grep-engines-fixed.sh
@@ -26,7 +26,8 @@ do
 		else
 			prereq=""
 		fi
-		test_perf $prereq "$engine log$GIT_PERF_4221_LOG_OPTS --grep='$pattern'" "
+		test_perf "$engine log$GIT_PERF_4221_LOG_OPTS --grep='$pattern'" \
+			--prereq "$prereq" "
 			git -c grep.patternType=$engine log --pretty=format:%h$GIT_PERF_4221_LOG_OPTS --grep='$pattern' >'out.$engine' || :
 		"
 	done
diff --git a/t/perf/p5302-pack-index.sh b/t/perf/p5302-pack-index.sh
index c16f6a3ff69..14c601bbf86 100755
--- a/t/perf/p5302-pack-index.sh
+++ b/t/perf/p5302-pack-index.sh
@@ -26,9 +26,8 @@ test_expect_success 'set up thread-counting tests' '
 	done
 '

-test_perf PERF_EXTRA 'index-pack 0 threads' '
-	rm -rf repo.git &&
-	git init --bare repo.git &&
+test_perf 'index-pack 0 threads' --prereq PERF_EXTRA \
+	--setup 'rm -rf repo.git && git init --bare repo.git' '
 	GIT_DIR=repo.git git index-pack --threads=1 --stdin < $PACK
 '

@@ -36,17 +35,15 @@ for t in $threads
 do
 	THREADS=$t
 	export THREADS
-	test_perf PERF_EXTRA "index-pack $t threads" '
-		rm -rf repo.git &&
-		git init --bare repo.git &&
+	test_perf "index-pack $t threads" --prereq PERF_EXTRA \
+		--setup 'rm -rf repo.git && git init --bare repo.git' '
 		GIT_DIR=repo.git GIT_FORCE_THREADS=1 \
 		git index-pack --threads=$THREADS --stdin <$PACK
 	'
 done

-test_perf 'index-pack default number of threads' '
-	rm -rf repo.git &&
-	git init --bare repo.git &&
+test_perf 'index-pack default number of threads' \
+	--setup 'rm -rf repo.git && git init --bare repo.git' '
 	GIT_DIR=repo.git git index-pack --stdin < $PACK
 '

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index c8be58f3c76..5b489c968b8 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -60,18 +60,6 @@ then
 	esac
 fi

-if test -n "$GIT_PERF_7519_DROP_CACHE"
-then
-	# When using GIT_PERF_7519_DROP_CACHE, GIT_PERF_REPEAT_COUNT must be 1 to
-	# generate valid results. Otherwise the caching that happens for the nth
-	# run will negate the validity of the comparisons.
-	if test "$GIT_PERF_REPEAT_COUNT" -ne 1
-	then
-		echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2
-		GIT_PERF_REPEAT_COUNT=1
-	fi
-fi
-
 trace_start() {
 	if test -n "$GIT_PERF_7519_TRACE"
 	then
@@ -167,10 +155,10 @@ setup_for_fsmonitor() {

 test_perf_w_drop_caches () {
 	if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-		test-tool drop-caches
+		test_perf "$1" --setup "test-tool drop-caches" "$2"
+	else
+		test_perf "$@"
 	fi
-
-	test_perf "$@"
 }

 test_fsmonitor_suite() {
diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
index 8b09c5bf328..9bfb86842a9 100755
--- a/t/perf/p7820-grep-engines.sh
+++ b/t/perf/p7820-grep-engines.sh
@@ -49,13 +49,15 @@ do
 		fi
 		if ! test_have_prereq PERF_GREP_ENGINES_THREADS
 		then
-			test_perf $prereq "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern'" "
+			test_perf "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern'" \
+				--prereq "$prereq" "
 				git -c grep.patternType=$engine grep$GIT_PERF_7820_GREP_OPTS -- '$pattern' >'out.$engine' || :
 			"
 		else
 			for threads in $GIT_PERF_GREP_THREADS
 			do
-				test_perf PTHREADS,$prereq "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern' with $threads threads" "
+				test_perf "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern' with $threads threads" \
+					--prereq PTHREADS,$prereq "
 					git -c grep.patternType=$engine -c grep.threads=$threads grep$GIT_PERF_7820_GREP_OPTS -- '$pattern' >'out.$engine.$threads' || :
 				"
 			done
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 407252bac70..a935ad622d3 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -189,19 +189,38 @@ exit $ret' >&3 2>&4
 }

 test_wrapper_ () {
-	test_wrapper_func_=$1; shift
+	local test_wrapper_func_=$1; shift
+	local test_title_=$1; shift
 	test_start_
-	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
-	test "$#" = 2 ||
-	BUG "not 2 or 3 parameters to test-expect-success"
+	test_prereq=
+	test_perf_setup_=
+	while test $# != 0
+	do
+		case $1 in
+		--prereq)
+			test_prereq=$2
+			shift
+			;;
+		--setup)
+			test_perf_setup_=$2
+			shift
+			;;
+		*)
+			break
+			;;
+		esac
+		shift
+	done
+	test "$#" = 1 || BUG "test_wrapper_ needs 2 positional parameters"
 	export test_prereq
-	if ! test_skip "$@"
+	export test_perf_setup_
+	if ! test_skip "$test_title_" "$@"
 	then
 		base=$(basename "$0" .sh)
 		echo "$test_count" >>"$perf_results_dir"/$base.subtests
 		echo "$1" >"$perf_results_dir"/$base.$test_count.descr
 		base="$perf_results_dir"/"$PERF_RESULTS_PREFIX$(basename "$0" .sh)"."$test_count"
-		"$test_wrapper_func_" "$@"
+		"$test_wrapper_func_" "$test_title_" "$@"
 	fi

 	test_finish_
@@ -214,6 +233,16 @@ test_perf_ () {
 		echo "perf $test_count - $1:"
 	fi
 	for i in $(test_seq 1 $GIT_PERF_REPEAT_COUNT); do
+		if test -n "$test_perf_setup_"
+		then
+			say >&3 "setup: $test_perf_setup_"
+			if ! test_eval_ $test_perf_setup_
+			then
+				test_failure_ "$test_perf_setup_"
+				break
+			fi
+
+		fi
 		say >&3 "running: $2"
 		if test_run_perf_ "$2"
 		then
@@ -237,11 +266,24 @@ test_perf_ () {
 	rm test_time.*
 }

+# Usage: test_perf 'title' [options] 'perf-test'
+#	Run the performance test script specified in perf-test with
+#	optional prerequisite and setup steps.
+# Options:
+#	--prereq prerequisites: Skip the test if prerequisites aren't met
+#	--setup "setup-steps": Run setup steps prior to each measured iteration
+#
 test_perf () {
 	test_wrapper_ test_perf_ "$@"
 }

 test_size_ () {
+	if test -n "$test_perf_setup_"
+	then
+		say >&3 "setup: $test_perf_setup_"
+		test_eval_ $test_perf_setup_
+	fi
+
 	say >&3 "running: $2"
 	if test_eval_ "$2" 3>"$base".result; then
 		test_ok_ "$1"
@@ -250,6 +292,14 @@ test_size_ () {
 	fi
 }

+# Usage: test_size 'title' [options] 'size-test'
+#	Run the size test script specified in size-test with optional
+#	prerequisites and setup steps. Returns the numeric value
+#	returned by size-test.
+# Options:
+#	--prereq prerequisites: Skip the test if prerequisites aren't met
+#	--setup "setup-steps": Run setup steps prior to the size measurement
+
 test_size () {
 	test_wrapper_ test_size_ "$@"
 }

From patchwork Tue Mar 29 00:42:29 2022
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12794352
Date: Tue, 29 Mar 2022 00:42:29 +0000
Subject: [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

Add basic performance tests for "git add" and "git stash" of many new
objects under various fsync settings. These show the benefit of batch
mode relative to full fsync.
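As a rough sketch of the "various fsync settings" mentioned above, the
three modes can be read as three sets of "-c" options handed to git. The
helper name fsync_config_for_mode is hypothetical and exists only for
this illustration; the option strings are the ones used by the perf
script in this patch. The loop only prints the commands, it does not run
git.

```shell
#!/bin/sh
# Illustrative helper (not part of the patch): map a mode label to the
# "-c" options the perf script passes to git.
fsync_config_for_mode () {
	case $1 in
	false)
		# The leading "-" removes loose objects from the set of
		# components that get fsynced.
		echo '-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
		;;
	true)
		echo '-c core.fsync=loose-object -c core.fsyncmethod=fsync'
		;;
	batch)
		echo '-c core.fsync=loose-object -c core.fsyncmethod=batch'
		;;
	*)
		echo "unknown mode: $1" >&2
		return 1
		;;
	esac
}

# Print the command line each mode would measure.
for mode in false true batch
do
	echo "git $(fsync_config_for_mode "$mode") add files_$mode"
done
```

Comparing the "true" and "batch" timings is what shows the benefit of
batch mode; the "false" run gives a baseline with no loose-object fsync.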
Signed-off-by: Neeraj Singh
---
 t/perf/p3700-add.sh | 59 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100755 t/perf/p3700-add.sh

diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh
new file mode 100755
index 00000000000..ef6024f9897
--- /dev/null
+++ b/t/perf/p3700-add.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object database
+# and index. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values
+# of that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of adding things to the object database"
+
+# Fsync is normally turned off for the test suite.
+GIT_TEST_FSYNC=1
+export GIT_TEST_FSYNC
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_fresh_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+for mode in false true batch
+do
+	case $mode in
+	false)
+		FSYNC_CONFIG='-c core.fsync=-loose-object -c core.fsyncmethod=fsync'
+		;;
+	true)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=fsync'
+		;;
+	batch)
+		FSYNC_CONFIG='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+		;;
+	esac
+
+	test_perf "add $total_files files (object_fsyncing=$mode)" \
+		--setup "
+		(rm -rf .git || 1) &&
+		git init &&
+		test_create_unique_files $dir_count $files_per_dir files_$mode
+	" "
+		git $FSYNC_CONFIG add files_$mode
+	"
+
+	test_perf "stash $total_files files (object_fsyncing=$mode)" \
+		--setup "
+		(rm -rf .git || 1) &&
+		git init &&
+		test_commit first &&
+		test_create_unique_files $dir_count $files_per_dir stash_files_$mode
+	" "
+		git $FSYNC_CONFIG stash push -u -- stash_files_$mode
+	"
+done
+
+test_done

From patchwork Tue Mar 29 00:42:30 2022
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12794354
Date: Tue, 29 Mar 2022 00:42:30 +0000
Subject: [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com, nksingh85@gmail.com, ps@pks.im, jeffhost@microsoft.com, Bagas Sanjaya , worldhello.net@gmail.com, "Neeraj K. Singh" , Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

The warning for an unrecognized fsyncMethod was not camel-cased.

Signed-off-by: Neeraj Singh
---
 config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.c b/config.c
index e9cac5f4707..ae819dee20b 100644
--- a/config.c
+++ b/config.c
@@ -1697,7 +1697,7 @@ static int git_default_core_config(const char *var, const char *value, void *cb)

 	if (!strcmp(var, "core.fsyncobjectfiles")) {
 		if (fsync_object_files < 0)
-			warning(_("core.fsyncobjectfiles is deprecated; use core.fsync instead"));
+			warning(_("core.fsyncObjectFiles is deprecated; use core.fsync instead"));
 		fsync_object_files = git_config_bool(var, value);
 		return 0;
 	}