From patchwork Tue Sep 22 22:49:15 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793499
From: Matheus Tavares
To: git@vger.kernel.org
Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com
Subject: [PATCH v2 01/19] convert: make convert_attrs() and convert structs public
Date: Tue, 22 Sep 2020 19:49:15 -0300
Message-Id:
X-Mailer: git-send-email 2.28.0
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

Move convert_attrs() declaration from convert.c to convert.h, together
with the conv_attrs struct and the crlf_action enum. This function and
the data structures will be used outside convert.c in the upcoming
parallel checkout implementation.

Signed-off-by: Jeff Hostetler
[matheus.bernardino: squash and reword msg]
Signed-off-by: Matheus Tavares
---
 convert.c | 23 ++---------------------
 convert.h | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/convert.c b/convert.c
index 8e6c292421..941a845692 100644
--- a/convert.c
+++ b/convert.c
@@ -24,17 +24,6 @@
 #define CONVERT_STAT_BITS_TXT_CRLF 0x2
 #define CONVERT_STAT_BITS_BIN 0x4
 
-enum crlf_action {
-	CRLF_UNDEFINED,
-	CRLF_BINARY,
-	CRLF_TEXT,
-	CRLF_TEXT_INPUT,
-	CRLF_TEXT_CRLF,
-	CRLF_AUTO,
-	CRLF_AUTO_INPUT,
-	CRLF_AUTO_CRLF
-};
-
 struct text_stat {
 	/* NUL, CR, LF and CRLF counts */
 	unsigned nul, lonecr, lonelf, crlf;
@@ -1297,18 +1286,10 @@ static int git_path_check_ident(struct attr_check_item *check)
 	return !!ATTR_TRUE(value);
 }
 
-struct conv_attrs {
-	struct convert_driver *drv;
-	enum crlf_action attr_action; /* What attr says */
-	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
-	int ident;
-	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
-};
-
 static struct attr_check *check;
 
-static void convert_attrs(const struct index_state *istate,
-			  struct conv_attrs *ca, const char *path)
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path)
 {
 	struct attr_check_item *ccheck = NULL;
 
diff --git a/convert.h b/convert.h
index e29d1026a6..aeb4a1be9a 100644
--- a/convert.h
+++ b/convert.h
@@ -37,6 +37,27 @@ enum eol {
 #endif
 };
 
+enum crlf_action {
+	CRLF_UNDEFINED,
+	CRLF_BINARY,
+	CRLF_TEXT,
+	CRLF_TEXT_INPUT,
+	CRLF_TEXT_CRLF,
+	CRLF_AUTO,
+	CRLF_AUTO_INPUT,
+	CRLF_AUTO_CRLF
+};
+
+struct convert_driver;
+
+struct conv_attrs {
+	struct convert_driver *drv;
+	enum crlf_action attr_action; /* What attr says */
+	enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */
+	int ident;
+	const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */
+};
+
 enum ce_delay_state {
 	CE_NO_DELAY = 0,
 	CE_CAN_DELAY = 1,
@@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 int would_convert_to_git_filter_fd(const struct index_state *istate,
 				   const char *path);
 
+void convert_attrs(const struct index_state *istate,
+		   struct conv_attrs *ca, const char *path);
+
 /*
  * Initialize the checkout metadata with the given values. Any argument may be
  * NULL if it is not applicable. The treeish should be a commit if that is

From patchwork Tue Sep 22 22:49:16 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793503
From: Matheus Tavares
To: git@vger.kernel.org
Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com
Subject: [PATCH v2 02/19] convert: add [async_]convert_to_working_tree_ca() variants
Date: Tue, 22 Sep 2020 19:49:16 -0300
Message-Id: <313c3bcbebb9460d62cc29692964111565043de0.1600814153.git.matheus.bernardino@usp.br>
X-Mailer: git-send-email 2.28.0
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', not relying, thus, on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.

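To make the split described above concrete, here is a minimal standalone
sketch of the same pattern. It is not git code: struct demo_attrs,
demo_load_attrs(), demo_convert_ca() and demo_convert() are hypothetical
stand-ins for struct conv_attrs, convert_attrs() and the conversion
functions, and the rule "paths ending in .txt get CRLF" is made up for
illustration only. The _ca() style function sees nothing but the
precomputed attributes, so it could run where no index is available,
while the path-based wrapper keeps the old calling convention by loading
the attributes and delegating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's struct conv_attrs (not the real layout). */
struct demo_attrs {
	int want_crlf;	/* nonzero: write LF as CRLF on checkout */
};

/*
 * Hypothetical stand-in for convert_attrs(). In git this step needs the
 * index state; here a made-up rule based on the file extension is used.
 */
void demo_load_attrs(const char *path, struct demo_attrs *ca)
{
	size_t n = strlen(path);

	ca->want_crlf = n >= 4 && !strcmp(path + n - 4, ".txt");
}

/*
 * "_ca" style conversion: depends only on the precomputed attributes.
 * Returns the number of bytes written to dst, or -1 if dst is too small.
 */
int demo_convert_ca(const struct demo_attrs *ca, const char *src, size_t len,
		    char *dst, size_t cap)
{
	size_t i, out = 0;

	assert(ca && src && dst);
	for (i = 0; i < len; i++) {
		if (ca->want_crlf && src[i] == '\n') {
			if (out == cap)
				return -1;
			dst[out++] = '\r';
		}
		if (out == cap)
			return -1;
		dst[out++] = src[i];
	}
	return (int)out;
}

/* Path-based wrapper: load the attributes, then delegate. */
int demo_convert(const char *path, const char *src, size_t len,
		 char *dst, size_t cap)
{
	struct demo_attrs ca;

	demo_load_attrs(path, &ca);
	return demo_convert_ca(&ca, src, len, dst, cap);
}
```

A caller that has already loaded the attributes, for example to decide
how to schedule a path, can pass them straight to demo_convert_ca() and
skip the second lookup that demo_convert() would perform.
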
Signed-off-by: Jeff Hostetler
[matheus.bernardino: squash, remove one function definition and reword]
Signed-off-by: Matheus Tavares
---
 convert.c | 50 ++++++++++++++++++++++++++++++++++++--------------
 convert.h |  9 +++++++++
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/convert.c b/convert.c
index 941a845692..55bcce891c 100644
--- a/convert.c
+++ b/convert.c
@@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
 }
 
-static int convert_to_working_tree_internal(const struct index_state *istate,
+static int convert_to_working_tree_internal(const struct conv_attrs *ca,
 					    const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing,
@@ -1455,11 +1455,8 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 					    struct delayed_checkout *dco)
 {
 	int ret = 0, ret_filter = 0;
-	struct conv_attrs ca;
-
-	convert_attrs(istate, &ca, path);
 
-	ret |= ident_to_worktree(src, len, dst, ca.ident);
+	ret |= ident_to_worktree(src, len, dst, ca->ident);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -1469,24 +1466,24 @@ static int convert_to_working_tree_internal(const struct index_state *istate,
 	 * is a smudge or process filter (even if the process filter doesn't
 	 * support smudge). The filters might expect CRLFs.
 	 */
-	if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) {
-		ret |= crlf_to_worktree(src, len, dst, ca.crlf_action);
+	if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) {
+		ret |= crlf_to_worktree(src, len, dst, ca->crlf_action);
 		if (ret) {
 			src = dst->buf;
 			len = dst->len;
 		}
 	}
 
-	ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding);
+	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
 	}
 
 	ret_filter = apply_filter(
-		path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco);
-	if (!ret_filter && ca.drv && ca.drv->required)
-		die(_("%s: smudge filter %s failed"), path, ca.drv->name);
+		path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco);
+	if (!ret_filter && ca->drv && ca->drv->required)
+		die(_("%s: smudge filter %s failed"), path, ca->drv->name);
 
 	return ret | ret_filter;
 }
@@ -1497,7 +1494,9 @@ int async_convert_to_working_tree(const struct index_state *istate,
 				  const struct checkout_metadata *meta,
 				  void *dco)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco);
 }
 
 int convert_to_working_tree(const struct index_state *istate,
@@ -1505,13 +1504,36 @@ int convert_to_working_tree(const struct index_state *istate,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL);
+}
+
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
+}
+
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta)
+{
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
 }
 
 int renormalize_buffer(const struct index_state *istate, const char *path,
 		       const char *src, size_t len, struct strbuf *dst)
 {
-	int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL);
+	struct conv_attrs ca;
+	int ret;
+
+	convert_attrs(istate, &ca, path);
+	ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/convert.h b/convert.h
index aeb4a1be9a..46d537d1ae 100644
--- a/convert.h
+++ b/convert.h
@@ -100,11 +100,20 @@ int convert_to_working_tree(const struct index_state *istate,
 			    const char *path, const char *src,
 			    size_t len, struct strbuf *dst,
 			    const struct checkout_metadata *meta);
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta);
 int async_convert_to_working_tree(const struct index_state *istate,
 				  const char *path, const char *src,
 				  size_t len, struct strbuf *dst,
 				  const struct checkout_metadata *meta,
 				  void *dco);
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco);
 int async_query_available_blobs(const char *cmd,
 				struct string_list *available_paths);
 int renormalize_buffer(const struct index_state *istate,

From patchwork Tue Sep 22 22:49:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793501
From: Matheus Tavares
To: git@vger.kernel.org
Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com
Subject: [PATCH v2 03/19] convert: add get_stream_filter_ca() variant
Date: Tue, 22 Sep 2020 19:49:17 -0300
Message-Id: <29bbdb78e98667393f17e167f3d72d1bfbb302bd.1600814153.git.matheus.bernardino@usp.br>
X-Mailer: git-send-email 2.28.0
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

Like the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs`, when we add support for parallel
checkout workers. So add the _ca() variant which takes the conversion
attributes struct as a parameter.

Signed-off-by: Jeff Hostetler
[matheus.bernardino: move header comment to ca() variant and reword msg]
Signed-off-by: Matheus Tavares
---
 convert.c | 28 +++++++++++++++++-----------
 convert.h |  2 ++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/convert.c b/convert.c
index 55bcce891c..c112ea23cb 100644
--- a/convert.c
+++ b/convert.c
@@ -1960,34 +1960,31 @@ static struct stream_filter *ident_filter(const struct object_id *oid)
 }
 
 /*
- * Return an appropriately constructed filter for the path, or NULL if
+ * Return an appropriately constructed filter for the given ca, or NULL if
  * the contents cannot be filtered without reading the whole thing
  * in-core.
  *
  * Note that you would be crazy to set CRLF, smudge/clean or ident to a
  * large binary blob you would want us not to slurp into the memory!
  */
-struct stream_filter *get_stream_filter(const struct index_state *istate,
-					const char *path,
-					const struct object_id *oid)
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid)
 {
-	struct conv_attrs ca;
 	struct stream_filter *filter = NULL;
 
-	convert_attrs(istate, &ca, path);
-	if (ca.drv && (ca.drv->process || ca.drv->smudge || ca.drv->clean))
+	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
 		return NULL;
 
-	if (ca.working_tree_encoding)
+	if (ca->working_tree_encoding)
 		return NULL;
 
-	if (ca.crlf_action == CRLF_AUTO || ca.crlf_action == CRLF_AUTO_CRLF)
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
 		return NULL;
 
-	if (ca.ident)
+	if (ca->ident)
 		filter = ident_filter(oid);
 
-	if (output_eol(ca.crlf_action) == EOL_CRLF)
+	if (output_eol(ca->crlf_action) == EOL_CRLF)
 		filter = cascade_filter(filter, lf_to_crlf_filter());
 	else
 		filter = cascade_filter(filter, &null_filter_singleton);
@@ -1995,6 +1992,15 @@ struct stream_filter *get_stream_filter(const struct index_state *istate,
 	return filter;
 }
 
+struct stream_filter *get_stream_filter(const struct index_state *istate,
+					const char *path,
+					const struct object_id *oid)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return get_stream_filter_ca(&ca, oid);
+}
+
 void free_stream_filter(struct stream_filter *filter)
 {
 	filter->vtbl->free(filter);
diff --git a/convert.h b/convert.h
index 46d537d1ae..262c1a1d46 100644
--- a/convert.h
+++ b/convert.h
@@ -169,6 +169,8 @@ struct stream_filter; /* opaque */
 struct stream_filter *get_stream_filter(const struct index_state *istate,
 					const char *path,
 					const struct object_id *);
+struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
+					   const struct object_id *oid);
 void free_stream_filter(struct stream_filter *);
 int is_null_stream_filter(struct stream_filter *);
 

From patchwork Tue Sep 22 22:49:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793505
From: Matheus Tavares
To: git@vger.kernel.org
Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com
Subject: [PATCH v2 04/19] convert: add conv_attrs classification
Date: Tue, 22 Sep 2020 19:49:18 -0300
Message-Id:
X-Mailer: git-send-email 2.28.0
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).

Signed-off-by: Jeff Hostetler
[matheus.bernardino: use classification in get_stream_filter_ca()]
Signed-off-by: Matheus Tavares
---
 convert.c | 26 +++++++++++++++++++-------
 convert.h | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/convert.c b/convert.c
index c112ea23cb..633ad6976a 100644
--- a/convert.c
+++ b/convert.c
@@ -1972,13 +1972,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca,
 {
 	struct stream_filter *filter = NULL;
 
-	if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean))
-		return NULL;
-
-	if (ca->working_tree_encoding)
-		return NULL;
-
-	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+	if (classify_conv_attrs(ca) != CA_CLASS_STREAMABLE)
 		return NULL;
 
 	if (ca->ident)
@@ -2034,3 +2028,21 @@ void clone_checkout_metadata(struct checkout_metadata *dst,
 	if (blob)
 		oidcpy(&dst->blob, blob);
 }
+
+enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca)
+{
+	if (ca->drv) {
+		if (ca->drv->process)
+			return CA_CLASS_INCORE_PROCESS;
+		if (ca->drv->smudge || ca->drv->clean)
+			return CA_CLASS_INCORE_FILTER;
+	}
+
+	if (ca->working_tree_encoding)
+		return CA_CLASS_INCORE;
+
+	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
+		return CA_CLASS_INCORE;
+
+	return CA_CLASS_STREAMABLE;
+}
diff --git a/convert.h b/convert.h
index 262c1a1d46..523ba9b140 100644
--- a/convert.h
+++ b/convert.h
@@ -190,4 +190,37 @@ int stream_filter(struct stream_filter *,
 		  const char *input, size_t *isize_p,
 		  char *output, size_t *osize_p);
 
+enum conv_attrs_classification {
+	/*
+	 * The blob must be loaded into a buffer before it can be
+	 * smudged. All smudging is done in-proc.
+	 */
+	CA_CLASS_INCORE,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * single-file driver filter, such as rot13.
+	 */
+	CA_CLASS_INCORE_FILTER,
+
+	/*
+	 * The blob must be loaded into a buffer, but uses a
+	 * long-running driver process, such as LFS. This might or
+	 * might not use delayed operations. (The important thing is
+	 * that there is a single subordinate long-running process
+	 * handling all associated blobs and in case of delayed
+	 * operations, may hold per-blob state.)
+	 */
+	CA_CLASS_INCORE_PROCESS,
+
+	/*
+	 * The blob can be streamed and smudged without needing to
+	 * completely read it into a buffer.
+ */ + CA_CLASS_STREAMABLE, +}; + +enum conv_attrs_classification classify_conv_attrs( + const struct conv_attrs *ca); + #endif /* CONVERT_H */ From patchwork Tue Sep 22 22:49:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793507 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 624FD59D for ; Tue, 22 Sep 2020 22:50:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3B0122076E for ; Tue, 22 Sep 2020 22:50:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="G+4tElQy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726787AbgIVWut (ORCPT ); Tue, 22 Sep 2020 18:50:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726637AbgIVWut (ORCPT ); Tue, 22 Sep 2020 18:50:49 -0400 Received: from mail-qt1-x841.google.com (mail-qt1-x841.google.com [IPv6:2607:f8b0:4864:20::841]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED1E4C061755 for ; Tue, 22 Sep 2020 15:50:48 -0700 (PDT) Received: by mail-qt1-x841.google.com with SMTP id k25so17062528qtu.4 for ; Tue, 22 Sep 2020 15:50:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=G8k64ZCfnLfypepuOf25OgupUF4UnPCEHqXyUVVuwAw=; b=G+4tElQygG8bF808U2ggOZdDrC1dcRvZ267WkIKIW1HVSpk+WEJlX6W40rgsc8VcWf zrToBk+JkJKeG0hJavJGXHTpgPXtyht+5et4JHHd1nBRrjP42ge0b8rWewoiOBZsRW9b 
Bjx/TNyU/D/ZK2ksA87UfDc3r5XNc6yTVj5eRf1gTRKvYgbCYka7zRcxCzV9A6c6SsOE fm9ZPpv6RGcfmfvELCCO0MNsp9yy91BqsiK2cSlFL/xnh37JU9hXkGL85yOXp+MzbH3Z SlNuDyK0HlgMv5lYphoxdhyti1kGlNLft5tNWQ145KTo5Lm8nikVokTXimhYB+byJ2Ez pn1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=G8k64ZCfnLfypepuOf25OgupUF4UnPCEHqXyUVVuwAw=; b=oYxLmUHzZi7AMT+ugGLF4MhJa8Ku6eCVXyIxbNFDYTPTJ7wyyWK5izf2RwsHLfCYYJ uMiTcFzB8ZKm6fR1TmYzghHlGvqMw/9uus/uzf/VklblVNpUDI2rvErc0BehmPeyvjkz c++DiWPt84hXkgtr2CCg7HRgFvUL8xQoiEWF+1Ynr/H/tzNFXbpyooIGK16DoNb82tpa q5TucuKonSwM4GUrSNyQwiX8d2Tp34TCv6KP5oExdfX/CVMLBYUPcG0hyb1pUzc4PQZ2 CgZam6kvymuHy3QAirdJm4nlpOZ/H9MMOH86qXo92nn134i/YhtL6b4fbsdJamh5ORgl M8Cw== X-Gm-Message-State: AOAM5339HmjCE+4YfuOagSAlUuSWhy5eSge7N07IqkvRDD0vuJ9epCO1 TgvG3xEx38C0WRE1Z1J/HzbyKTQF8IDiJw== X-Google-Smtp-Source: ABdhPJy9tg3GD27MMvuisPvP2c1rzjIPDS4ZEtCdosW68zMFYxYJESfYWZjF98gstPGvnLIdKaL4IA== X-Received: by 2002:ac8:1b92:: with SMTP id z18mr6940413qtj.265.1600815047737; Tue, 22 Sep 2020 15:50:47 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.50.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:50:47 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 05/19] entry: extract a header file for entry.c functions Date: Tue, 22 Sep 2020 19:49:19 -0300 Message-Id: <25b311745aac7aaf43335a754ac8af3b79b65404.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The declarations of entry.c's public functions and structures currently reside in cache.h. 
Although not many, they contribute to the size of cache.h and, when
changed, cause the unnecessary recompilation of modules that don't
really use these functions. So let's move them to a new entry.h header.

Original-patch-by: Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy
Signed-off-by: Matheus Tavares
---
 apply.c                  |  1 +
 builtin/checkout-index.c |  1 +
 builtin/checkout.c       |  1 +
 builtin/difftool.c       |  1 +
 cache.h                  | 24 -----------------------
 entry.c                  |  9 +--------
 entry.h                  | 41 ++++++++++++++++++++++++++++++++++++++++
 unpack-trees.c           |  1 +
 8 files changed, 47 insertions(+), 32 deletions(-)
 create mode 100644 entry.h

diff --git a/apply.c b/apply.c
index 76dba93c97..ddec80b4b0 100644
--- a/apply.c
+++ b/apply.c
@@ -21,6 +21,7 @@
 #include "quote.h"
 #include "rerere.h"
 #include "apply.h"
+#include "entry.h"
 
 struct gitdiff_data {
 	struct strbuf *root;
diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index a854fd16e7..0f1ff73129 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -11,6 +11,7 @@
 #include "quote.h"
 #include "cache-tree.h"
 #include "parse-options.h"
+#include "entry.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 0951f8fee5..b18b9d6f3c 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -26,6 +26,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "xdiff-interface.h"
+#include "entry.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [<options>] <branch>"),
diff --git a/builtin/difftool.c b/builtin/difftool.c
index 7ac432b881..dfa22b67eb 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -23,6 +23,7 @@
 #include "lockfile.h"
 #include "object-store.h"
 #include "dir.h"
+#include "entry.h"
 
 static int trust_exit_code;
 
diff --git a/cache.h b/cache.h
index cee8aa5dc3..17350cafa2 100644
--- a/cache.h
+++ b/cache.h
@@ -1706,30 +1706,6 @@ const char *show_ident_date(const struct ident_split *id,
  */
 int ident_cmp(const struct ident_split *, const struct ident_split *);
 
-struct checkout {
-	struct index_state *istate;
-	const char *base_dir;
-	int base_dir_len;
-	struct delayed_checkout *delayed_checkout;
-	struct checkout_metadata meta;
-	unsigned force:1,
-		 quiet:1,
-		 not_new:1,
-		 clone:1,
-		 refresh_cache:1;
-};
-#define CHECKOUT_INIT { NULL, "" }
-
-#define TEMPORARY_FILENAME_LENGTH 25
-int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts);
-void enable_delayed_checkout(struct checkout *state);
-int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
-/*
- * Unlink the last component and schedule the leading directories for
- * removal, such that empty directories get removed.
- */
-void unlink_entry(const struct cache_entry *ce);
-
 struct cache_def {
 	struct strbuf path;
 	int flags;
diff --git a/entry.c b/entry.c
index a0532f1f00..b0b8099699 100644
--- a/entry.c
+++ b/entry.c
@@ -6,6 +6,7 @@
 #include "submodule.h"
 #include "progress.h"
 #include "fsmonitor.h"
+#include "entry.h"
 
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -429,14 +430,6 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-/*
- * Write the contents from ce out to the working tree.
- *
- * When topath[] is not NULL, instead of writing to the working tree
- * file named by ce, a temporary file is created by this function and
- * its name is returned in topath[], which must be able to hold at
- * least TEMPORARY_FILENAME_LENGTH bytes long.
- */
 int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		   char *topath, int *nr_checkouts)
 {
diff --git a/entry.h b/entry.h
new file mode 100644
index 0000000000..2d69185448
--- /dev/null
+++ b/entry.h
@@ -0,0 +1,41 @@
+#ifndef ENTRY_H
+#define ENTRY_H
+
+#include "cache.h"
+#include "convert.h"
+
+struct checkout {
+	struct index_state *istate;
+	const char *base_dir;
+	int base_dir_len;
+	struct delayed_checkout *delayed_checkout;
+	struct checkout_metadata meta;
+	unsigned force:1,
+		 quiet:1,
+		 not_new:1,
+		 clone:1,
+		 refresh_cache:1;
+};
+#define CHECKOUT_INIT { NULL, "" }
+
+#define TEMPORARY_FILENAME_LENGTH 25
+
+/*
+ * Write the contents from ce out to the working tree.
+ *
+ * When topath[] is not NULL, instead of writing to the working tree
+ * file named by ce, a temporary file is created by this function and
+ * its name is returned in topath[], which must be able to hold at
+ * least TEMPORARY_FILENAME_LENGTH bytes long.
+ */
+int checkout_entry(struct cache_entry *ce, const struct checkout *state,
+		   char *topath, int *nr_checkouts);
+void enable_delayed_checkout(struct checkout *state);
+int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
+/*
+ * Unlink the last component and schedule the leading directories for
+ * removal, such that empty directories get removed.
+ */
+void unlink_entry(const struct cache_entry *ce);
+
+#endif /* ENTRY_H */
diff --git a/unpack-trees.c b/unpack-trees.c
index 323280dd48..a511fadd89 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -16,6 +16,7 @@
 #include "fsmonitor.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "entry.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as

From patchwork Tue Sep 22 22:49:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793509
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6638C139A for ; Tue, 22 Sep 2020 22:50:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 44C54206A5 for ; Tue, 22 Sep 2020 22:50:53 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="Cucnxk/A"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726792AbgIVWuw (ORCPT ); Tue, 22 Sep 2020 18:50:52 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726637AbgIVWuw (ORCPT ); Tue, 22 Sep 2020 18:50:52 -0400
Received: from mail-qv1-xf44.google.com (mail-qv1-xf44.google.com [IPv6:2607:f8b0:4864:20::f44]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EDF97C061755 for ; Tue, 22 Sep 2020 15:50:51 -0700 (PDT)
Received: by mail-qv1-xf44.google.com with SMTP id di5so10395741qvb.13 for ; Tue, 22 Sep 2020 15:50:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references
:mime-version:content-transfer-encoding; bh=pnjvx5nwJkdEA5uoqvN7tjpCjIhkVeUGZe2ysSBLLsA=; b=Cucnxk/AblIz7aAOQpsjuhBqoibRc2y8xkZK4zHnBMl8JeqArIotgS1+W9/T+G+v5W DyeKnuxUCAS+8YT/F41uIdssW2kKlI9Ku0bO0mEQ4pwp3LAON+3mj4GrRQ5dvO+ZrQdn iXi4TNq/duy/TSlH1zRrfhOesVfr97uRF2OuEkbJGqY1/TjZn0uQXv3lyPYWVzq7aBG4 OQ+4IkiTByGH1GzzqifEYRgmDkvVJuuOnmSnDXMI5LlGKTp3buR54XNSMwceyfesLkZA rd+uLiWpV2VPjqJMgNAC/rHW7FhL8/umurVnjY+/G6yjG9n5L4oWNL2VJ4rAyisBRzrt 28gw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pnjvx5nwJkdEA5uoqvN7tjpCjIhkVeUGZe2ysSBLLsA=; b=JxmdMD0GEV+EPlafMmFKMKMOZ8q4DffVi11N6D9eot42Tn8oPAlUusMPxQC/dwUD9c HJoF4vJp3bQv2TWyzXSV2EDuH58xO4XrxL5LWxxUPkXhPD5yMc4dNKqzupJzVHUxE90Z ZdQevB6vQRfWlfIPuDG82xPfn639CCH0y1GaMFUrluKDljgYovQyKQaUPAZiA8NVA0L/ dkrvfX1B7aLvriQGDznoKETXboZaelr/Xqiy4tr2ewQeFi4yt9r4buVFR23606hzMttI poqzIB85vT+a7g/RVNea8EhtB/6F9WYPJWyH8Sg/UHoU7rk/+CubdU/3WSTCcDcigxDL 2GyQ== X-Gm-Message-State: AOAM531DC10gB2tLmhaiUE5LaOqi1WVt5iUtPK0BfaBf8mYcSvOv3hTf pKuSYsn2nADHbzHPRGidaWkCMnp8yIJcjA== X-Google-Smtp-Source: ABdhPJw0jJsxKInqwFtMhvwnV7oFSTYoJ6QWGLNyS/VscyG4OOdrm4KzBt34R0YfvTxiIFr5099UuA== X-Received: by 2002:ad4:544a:: with SMTP id h10mr8450449qvt.35.1600815050824; Tue, 22 Sep 2020 15:50:50 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.50.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:50:50 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 06/19] entry: make fstat_output() and read_blob_entry() public Date: Tue, 22 Sep 2020 19:49:20 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk 
List-ID:
X-Mailing-List: git@vger.kernel.org

These two functions will be used by the parallel checkout code, so let's
make them public. Note: fstat_output() is renamed to
fstat_checkout_output() now that it is public, to avoid future name
collisions.

Signed-off-by: Matheus Tavares
---
 entry.c | 8 ++++----
 entry.h | 2 ++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index b0b8099699..b36071a610 100644
--- a/entry.c
+++ b/entry.c
@@ -84,7 +84,7 @@ static int create_file(const char *path, unsigned int mode)
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
 {
 	enum object_type type;
 	void *blob_data = read_object_file(&ce->oid, &type, size);
@@ -109,7 +109,7 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf
 	}
 }
 
-static int fstat_output(int fd, const struct checkout *state, struct stat *st)
+int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 {
 	/* use fstat() only when path == ce->name */
 	if (fstat_is_reliable() &&
@@ -132,7 +132,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path,
 		return -1;
 
 	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
-	*fstat_done = fstat_output(fd, state, statbuf);
+	*fstat_done = fstat_checkout_output(fd, state, statbuf);
 	result |= close(fd);
 
 	if (result)
@@ -346,7 +346,7 @@ static int write_entry(struct cache_entry *ce,
 
 		wrote = write_in_full(fd, new_blob, size);
 		if (!to_tempfile)
-			fstat_done = fstat_output(fd, state, &st);
+			fstat_done = fstat_checkout_output(fd, state, &st);
 		close(fd);
 		free(new_blob);
 		if (wrote < 0)
diff --git a/entry.h b/entry.h
index 2d69185448..f860e60846 100644
--- a/entry.h
+++ b/entry.h
@@ -37,5 +37,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
  * removal, such that empty directories get
removed. */ void unlink_entry(const struct cache_entry *ce); +void *read_blob_entry(const struct cache_entry *ce, unsigned long *size); +int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st); #endif /* ENTRY_H */ From patchwork Tue Sep 22 22:49:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793511 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 865FB59D for ; Tue, 22 Sep 2020 22:50:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6BCC52076E for ; Tue, 22 Sep 2020 22:50:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="DSWYjfsM" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726801AbgIVWuz (ORCPT ); Tue, 22 Sep 2020 18:50:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726794AbgIVWuz (ORCPT ); Tue, 22 Sep 2020 18:50:55 -0400 Received: from mail-qk1-x742.google.com (mail-qk1-x742.google.com [IPv6:2607:f8b0:4864:20::742]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2754DC061755 for ; Tue, 22 Sep 2020 15:50:55 -0700 (PDT) Received: by mail-qk1-x742.google.com with SMTP id c62so7738256qke.1 for ; Tue, 22 Sep 2020 15:50:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nN6bJrj4N/km2e+ROlSwUAEO25dq+vCA518kUob9Gws=; b=DSWYjfsM1pMKxC/dfA4rKmfvnWF3WO35lufTiVPQPvTU1l7HeP5zzDNCLKqmJiarOV 
3WzD32JoQbDdiGk5cg/DnwLoIAfzkd4mgV78mmiddALLk/UcvJHgxl4G1TwwpfCC9uqK k+aTHe5kCgtWZ4rLuHKyz1Oes4lMhtKCJQY1NxMcjFLMvghDZRlHDqbj+h5x+9C+eRWw We7jDFwlmp/OQAv1lo7KkgsT8F0un2n+DYRgCkPpy+zf6bZ9tfQIt5Xja//yj85oo0oP SR4zqiOQtEJ+nGnJI5RvXpKHEiR0pTH4K2eECrxzvoQgp8aaeeuNHXXrHhYVxtuIUs5l kmVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nN6bJrj4N/km2e+ROlSwUAEO25dq+vCA518kUob9Gws=; b=tPKWmSrvG41WfysvgRgMQm/qL9W0C1oIXV9zHwtSUx3XJkB76qQvhfZbEquz5LZch5 QMikBtul+aZAfSWYjYAnnwp0FPECuTqjhqn0H/vrVmn2oR0HAZis032tPzruuq+xP6Rq W2OHyNHX6tB0rVZVNVlWkFoxoFNtWYiUv4iqeAbtCgfVr72OkhJuhrRRmD4OlmMK8WOm Ju2tS+RDJ5hc7ZMDulz33CfStzDN8aCCYqqQN1Njmwu00fjJyA4uuHS4qdet0bH13yui kVcFY+eUXFMgMIGNzo6g4P6R4tKwxmogkMLhycb3nD7/yF/qCHsvW1jUvYrZEfZoM34W At4Q== X-Gm-Message-State: AOAM530f8KRX2uP7HRLFpK5vitZRLGSA+o35KhXGo95vBzOhvNWJbQDb xNXLo0eYYTNDPMtxWWHPxzcWGkoJFE6apw== X-Google-Smtp-Source: ABdhPJzUeBHdRPpZbtwIDOnsnC1rh7iMMggl0THUMSRYnqxANmhYffLDzRLdSoEQyMPJmrMS+kD0bQ== X-Received: by 2002:a05:620a:a09:: with SMTP id i9mr6593058qka.201.1600815054033; Tue, 22 Sep 2020 15:50:54 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.50.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:50:53 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 07/19] entry: extract cache_entry update from write_entry() Date: Tue, 22 Sep 2020 19:49:21 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This code will be used by the parallel checkout functions, outside entry.c, so extract it to a public function. 
Signed-off-by: Matheus Tavares
---
 entry.c | 25 ++++++++++++++++---------
 entry.h |  2 ++
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index b36071a610..1d2df188e5 100644
--- a/entry.c
+++ b/entry.c
@@ -251,6 +251,18 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 	return errs;
 }
 
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st)
+{
+	if (state->refresh_cache) {
+		assert(state->istate);
+		fill_stat_cache_info(state->istate, ce, st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
+}
+
 static int write_entry(struct cache_entry *ce,
 		       char *path, const struct checkout *state, int to_tempfile)
 {
@@ -371,15 +383,10 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
-		assert(state->istate);
-		if (!fstat_done)
-			if (lstat(ce->name, &st) < 0)
-				return error_errno("unable to stat just-written file %s",
-						   ce->name);
-		fill_stat_cache_info(state->istate, ce, &st);
-		ce->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(state->istate, ce);
-		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+		if (!fstat_done && lstat(ce->name, &st) < 0)
+			return error_errno("unable to stat just-written file %s",
+					   ce->name);
+		update_ce_after_write(state, ce, &st);
 	}
 delayed:
 	return 0;
diff --git a/entry.h b/entry.h
index f860e60846..664aed1576 100644
--- a/entry.h
+++ b/entry.h
@@ -39,5 +39,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 void unlink_entry(const struct cache_entry *ce);
 void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st);
 
 #endif /* ENTRY_H */

From patchwork Tue Sep 22 22:49:22 2020
Content-Type: text/plain;
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793513 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 83B4B139A for ; Tue, 22 Sep 2020 22:50:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 684D42076E for ; Tue, 22 Sep 2020 22:50:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="Vc1D9RGY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726807AbgIVWu6 (ORCPT ); Tue, 22 Sep 2020 18:50:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726794AbgIVWu6 (ORCPT ); Tue, 22 Sep 2020 18:50:58 -0400 Received: from mail-qk1-x741.google.com (mail-qk1-x741.google.com [IPv6:2607:f8b0:4864:20::741]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 345B8C061755 for ; Tue, 22 Sep 2020 15:50:58 -0700 (PDT) Received: by mail-qk1-x741.google.com with SMTP id q63so20913665qkf.3 for ; Tue, 22 Sep 2020 15:50:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=d0o9M053aS4Fpu58NIsI3IZY8vBtu9bdnRbtLyCPxzA=; b=Vc1D9RGYhwhVAFKQ/f5o1bqAkji7S/Bxr2U1wHbK9hbIRD+iRmrkCIm8qspIdc7hgC ua2j0SDWQD6SIQStepA6Oo6k7Afn2VkHEfW01YaESbPwsS+FIUTAHUHZyZO+5LMOu7uE IYmlD4MBBjknBBKfeQYilV8p+j6kTsHx8Q1B0V1lEkYI7Wk6kqlWRg67bIq4pS4LSfwM CWW/0ND4wrmMnGzOT2wPVXniG5S+Xw5E+OqQPbxlFEnw0nXxwXbTEsIISIoqr8BpLDKp oR3e4UrfMWjy7wY5HY1XTkQ7DkK7RzQArqflGOUU48t/d73XwymK593hh8dGn8YQyPBQ 6qrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=d0o9M053aS4Fpu58NIsI3IZY8vBtu9bdnRbtLyCPxzA=; b=Dn2OQ2TPokRMaoZgI6F1Fag8HwQO7w42shIZUSOct0sAGGI1q10Yc2oJMIojXqqhg5 GYz/q7Td21qYKC5zoANfFk08jTkLWOnlTv69N7ck6Q18xtGXiM9suokkqS+ZHrtibzaT Rb6WoTEBDgOChjoGpNBkMIdlC40ZjyDSrl2I0n3GPcG+e/BgCzSQ/PlY0ollgaHNE3oZ N3ulz2k2R9cm/IORNKShdoEEnU2hxxeizrcALBlJRUinTIcRAAoToPyyoBHq5JSO2bIV WuQndNRNZX6//NVqsjV85vbNkdIMVr9cjn8oiLLMG/x7O2I5n4TSX3wt7UOdrPHOnq6y FWuA== X-Gm-Message-State: AOAM5313p4e4r85d3NZvE6j5+rkRHyRgXO66ORSrqRFXqflEhhdmalCZ BZsQfu/K6/EKupqGcZRortvsnPdog4OJtQ== X-Google-Smtp-Source: ABdhPJxzFA2aBOGkNj1ZOYGt9XH8SlHtCgVa76Arf45DCY+yiq8qb6VfkhPYL81/xE3+hHvjdZY/5Q== X-Received: by 2002:a37:9d4d:: with SMTP id g74mr6936657qke.422.1600815056920; Tue, 22 Sep 2020 15:50:56 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.50.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:50:56 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry() Date: Tue, 22 Sep 2020 19:49:22 -0300 Message-Id: <667ad0dea70cb7f0bbf8f52467f15129b3ae1325.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In a following patch, checkout_entry() will use conv_attrs to decide whether an entry should be enqueued for parallel checkout or not. But the attributes lookup only happens lower in this call stack. To avoid the unnecessary work of loading the attributes twice, let's move it up to checkout_entry(), and pass the loaded struct down to write_entry(). 
Signed-off-by: Matheus Tavares
---
 entry.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 1d2df188e5..8237859b12 100644
--- a/entry.c
+++ b/entry.c
@@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 	}
 }
 
-static int write_entry(struct cache_entry *ce,
-		       char *path, const struct checkout *state, int to_tempfile)
+/* Note: ca is used (and required) iff the entry refers to a regular file. */
+static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
+		       const struct checkout *state, int to_tempfile)
 {
 	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
 	struct delayed_checkout *dco = state->delayed_checkout;
@@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
 	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
 
 	if (ce_mode_s_ifmt == S_IFREG) {
-		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
-								 &ce->oid);
+		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
 		if (filter &&
 		    !streaming_write_entry(ce, path, filter,
 					   state, to_tempfile,
@@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
 		 * Convert from git internal format to working tree format
 		 */
 		if (dco && dco->state != CE_NO_DELAY) {
-			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
-							    size, &buf, &meta, dco);
+			ret = async_convert_to_working_tree_ca(ca, ce->name,
+							       new_blob, size,
+							       &buf, &meta, dco);
 			if (ret && string_list_has_string(&dco->paths, ce->name)) {
 				free(new_blob);
 				goto delayed;
 			}
-		} else
-			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
+		} else {
+			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
+							 size, &buf, &meta);
+		}
 
 		if (ret) {
 			free(new_blob);
@@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	struct conv_attrs ca;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 	}
 
-	if (topath)
-		return write_entry(ce, topath, state, 1);
+	if (topath) {
+		if (S_ISREG(ce->ce_mode)) {
+			convert_attrs(state->istate, &ca, ce->name);
+			return write_entry(ce, topath, &ca, state, 1);
+		}
+		return write_entry(ce, topath, NULL, state, 1);
+	}
 
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
@@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
+
 	if (nr_checkouts)
 		(*nr_checkouts)++;
-	return write_entry(ce, path.buf, state, 0);
+
+	if (S_ISREG(ce->ce_mode)) {
+		convert_attrs(state->istate, &ca, ce->name);
+		return write_entry(ce, path.buf, &ca, state, 0);
+	}
+
+	return write_entry(ce, path.buf, NULL, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)

From patchwork Tue Sep 22 22:49:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793515
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 87195139A for ; Tue, 22 Sep 2020 22:51:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6BE0C2071A for ; Tue, 22 Sep 2020 22:51:02 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="SAjtBYHv"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726810AbgIVWvB (ORCPT ); Tue, 22 Sep 2020 18:51:01 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
ESMTP id S1726794AbgIVWvB (ORCPT ); Tue, 22 Sep 2020 18:51:01 -0400 Received: from mail-qt1-x841.google.com (mail-qt1-x841.google.com [IPv6:2607:f8b0:4864:20::841]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0C3E2C061755 for ; Tue, 22 Sep 2020 15:51:01 -0700 (PDT) Received: by mail-qt1-x841.google.com with SMTP id 19so17091302qtp.1 for ; Tue, 22 Sep 2020 15:51:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XYT4R9ALwv2Rpt7TK8K0S9uE1PRUq0gDUVR/1tkQK5U=; b=SAjtBYHvdWRHACIYnBSV8Rh0loxH49AmHs4/3su42BAil9p3EoPHYcg8V/4jj38isN +bMD90h4PR5DG1fu2XPuLqxmjTsGfYNzLzbnNzwGNIc3YUZDQiQwgtx3tL0V0dl0eycj UqkvpuAsUh8c0ySGSRO/DaPZkbJWkH8aqCy+ox6JlYyzAeEDl8popDwHPuMwZbR2SBU0 ybD3e39pY+REPxYe5tk2/OAl5UIWrTXWmpk3TsTHK3ssWb342ARCCszA2lqU0M/2GFQ6 4RF0+zj1voo/GELR7G9RZOBPh6Brpl4LjGHZ6gs4b/D55IQXdN1GBNiev35fDI42I+Hi n+Kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XYT4R9ALwv2Rpt7TK8K0S9uE1PRUq0gDUVR/1tkQK5U=; b=AgJI80868aq5wj3hPmurwh3/atl9+SKrDNDlHYUHBZD2PYzjf2ypqNfhMB5RF7KaCf ya5cpJ4GcO5U2G/Vtg+8z2MckFPBEt1LTBiK11HQnwaeDJ3Y8PXYhniYTY5pgBKSJR/i DnqHnPrrb8FA7ZJFOwnzv8FnOWcTx4MUHGytZE+yCxY/alz4ieadzEPbaPuMdFK3hv6j zynCP1/RUC6SFTF2tCm8lfKL3Du6pb2CxjMTtyguOSj6TKZS//W3Ype7J7s6oHYqPxl2 scRhrP5bmGG4ZhIIElbpuiIkurD9HoiI9O6ehJF/F5jt/hs211ioPltCfFUwFsMXog2y g4yQ== X-Gm-Message-State: AOAM531DbuQ5rEXZN3AEspBTlRg3HI7AdPZqUV60xBu1rFeYTYZdBU6m Uj6+wIV6VRL/pBJQpUEpQB38rjc02LKurA== X-Google-Smtp-Source: ABdhPJw3/JfAhZXhvVa+j4ZmgTKhGkDzSMnb+Ut064xJoHoJ8R04pqJaHB0rZTmlLFPDzn3KNccebg== X-Received: by 2002:ac8:1a08:: with SMTP id v8mr7256223qtj.353.1600815059853; Tue, 22 Sep 2020 15:50:59 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by 
smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.50.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:50:59 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Date: Tue, 22 Sep 2020 19:49:23 -0300 Message-Id: <4ddb34209e9340f4e709234262a4a9ce81ad9b51.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The parallel checkout machinery will call checkout_entry() for entries that could not be written in parallel due to path collisions. At this point, we will already be holding the conversion attributes for each entry, and it would be wasteful to let checkout_entry() load these again. Instead, let's add the checkout_entry_ca() variant, which optionally takes a preloaded conv_attrs struct. 
Signed-off-by: Matheus Tavares
---
 entry.c | 23 ++++++++++++-----------
 entry.h | 13 +++++++++++--
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/entry.c b/entry.c
index 8237859b12..9d79a5671f 100644
--- a/entry.c
+++ b/entry.c
@@ -440,12 +440,13 @@ static void mark_colliding_entries(const struct checkout *state,
 	}
 }
 
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts)
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
-	struct conv_attrs ca;
+	struct conv_attrs ca_buf;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -459,11 +460,11 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	}
 
 	if (topath) {
-		if (S_ISREG(ce->ce_mode)) {
-			convert_attrs(state->istate, &ca, ce->name);
-			return write_entry(ce, topath, &ca, state, 1);
+		if (S_ISREG(ce->ce_mode) && !ca) {
+			convert_attrs(state->istate, &ca_buf, ce->name);
+			ca = &ca_buf;
 		}
-		return write_entry(ce, topath, NULL, state, 1);
+		return write_entry(ce, topath, ca, state, 1);
 	}
 
 	strbuf_reset(&path);
@@ -530,12 +531,12 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 	if (nr_checkouts)
 		(*nr_checkouts)++;
 
-	if (S_ISREG(ce->ce_mode)) {
-		convert_attrs(state->istate, &ca, ce->name);
-		return write_entry(ce, path.buf, &ca, state, 0);
+	if (S_ISREG(ce->ce_mode) && !ca) {
+		convert_attrs(state->istate, &ca_buf, ce->name);
+		ca = &ca_buf;
 	}
 
-	return write_entry(ce, path.buf, NULL, state, 0);
+	return write_entry(ce, path.buf, ca, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)
diff --git a/entry.h b/entry.h
index 664aed1576..2081fbbbab 100644
--- a/entry.h
+++ b/entry.h
@@ -27,9 +27,18 @@ struct checkout {
  * file named by ce, a temporary file is created by this function and
  * its name is returned in topath[], which must be able to hold at
  * least TEMPORARY_FILENAME_LENGTH bytes long.
+ *
+ * With checkout_entry_ca(), callers can optionally pass a preloaded
+ * conv_attrs struct (to avoid reloading it), when ce refers to a
+ * regular file. If ca is NULL, the attributes will be loaded
+ * internally when (and if) needed.
  */
-int checkout_entry(struct cache_entry *ce, const struct checkout *state,
-		   char *topath, int *nr_checkouts);
+#define checkout_entry(ce, state, topath, nr_checkouts) \
+	checkout_entry_ca(ce, NULL, state, topath, nr_checkouts)
+int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
+		      const struct checkout *state, char *topath,
+		      int *nr_checkouts);
+
 void enable_delayed_checkout(struct checkout *state);
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 /*

From patchwork Tue Sep 22 22:49:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11793517
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 771A659D for ; Tue, 22 Sep 2020 22:51:06 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4C6512076E for ; Tue, 22 Sep 2020 22:51:06 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="SD9fu47s"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726817AbgIVWvF (ORCPT ); Tue, 22 Sep 2020 18:51:05 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58406 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726794AbgIVWvE (ORCPT ); Tue, 22 Sep 2020 18:51:04 -0400
Received: from mail-qt1-x844.google.com (mail-qt1-x844.google.com [IPv6:2607:f8b0:4864:20::844]) by lindbergh.monkeyblade.net
(Postfix) with ESMTPS id 8847FC061755 for ; Tue, 22 Sep 2020 15:51:04 -0700 (PDT) Received: by mail-qt1-x844.google.com with SMTP id v54so17048487qtj.7 for ; Tue, 22 Sep 2020 15:51:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Knst5igmk+hv7j4VT1wucLKj+2gCA6o9aneFlhHiWT4=; b=SD9fu47svj/MPBW5HZ0YTSrZkOJNfqWTdnMVdUpwL8FolAbGZx567PR3MPi+5qe8wG j1TSbGNIw57R/biwhV9lK17iEgwF5tsGiyHzgwFPlrIxN201BPd7/OI81aeGEF+7d/tU da3f5oEPqaSO+Y93NLq4vzhBp0CEBiIykxVaKD5Jyk8wJ6Uefc7BxeHJWUHDB/hXPLz3 BEHGbq894QAt1Wz+dlBe9Q70+fsnHk6hzsZf6/A9eaRA+B8rHCxs/d6mkG6poNdFtEca m837EYn+bhQtZKJxyWrFz8iuwwgssrHwaoFY/UrLC+1asJoomqkHn2aFIkCmTI//ums5 oYCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Knst5igmk+hv7j4VT1wucLKj+2gCA6o9aneFlhHiWT4=; b=cSY9AJNCQSLKmlkCNP/YCj3yTzL1LP+Cg2VrVV6y6ifZeI5ru7Yq40iRvDLQxG3Jmz s2g8N5sRF0LZLcYrW1Jvs2Z5JAixpU1keQxxuHq3/ObW/0GksTcKhut+MzhsBe8Aenz2 RFRJpzlAJ5nWnP80qx7uEDrizRfobQd7mQDhIAuj056XuL5hn/kVv3iJgad7OSGjUnLB z28rzeJUPauqB1YKpvURs3vMNWyIXfCr1Wd2uBP0hjKKSXl97rm3bbycFOHy46+4lSHh kWcqULADpnOW+QQFx7H2d4889jWHVbfTkUevrbyRUvwrhmnnW8H+lYfJawi4xkY+O08x nt8g== X-Gm-Message-State: AOAM530AVFP4Jnzzw9EosshBMbDRmLCFjfRh2+jghk+pPozR+OpUNugc FGjGnKZ/g/091yUz+YO+X5vWNrOV73SYcQ== X-Google-Smtp-Source: ABdhPJzLbBiNtzDxvVRBsX6KF3bnwWh19Fc0JIysEeuHvKBw2u4at5Oz9+0HRqhZxXjvwhYz+o/jow== X-Received: by 2002:ac8:5517:: with SMTP id j23mr7205483qtq.47.1600815062879; Tue, 22 Sep 2020 15:51:02 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:02 -0700 (PDT) From: Matheus Tavares To: 
git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 10/19] unpack-trees: add basic support for parallel checkout Date: Tue, 22 Sep 2020 19:49:24 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This new interface allows us to enqueue some of the entries being checked out to later call write_entry() for them in parallel. For now, the parallel checkout machinery is enabled by default and there is no user configuration, but run_parallel_checkout() just writes the queued entries in sequence (without spawning additional workers). The next patch will actually implement the parallelism and, later, we will make it configurable. When there are path collisions among the entries being written (which can happen e.g. with case-sensitive files in case-insensitive file systems), the parallel checkout code detects the problem and marks the item with PC_ITEM_COLLIDED. Later, these items are sequentially fed to checkout_entry() again. This is similar to the way the sequential code deals with collisions, overwriting the previously checked out entries with the subsequent ones. The only difference is that, when we start writing the entries in parallel, we won't be able to determine which of the colliding entries will survive on disk (for the sequential algorithm, it is always the last one). I also experimented with the idea of not overwriting colliding entries, and it seemed to work well in my simple tests. However, because just one entry of each colliding group would be actually written, the others would have null lstat() fields on the index. 
This might not be a problem by itself, but it could cause performance penalties for subsequent commands that need to refresh the index: when the st_size value cached is 0, read-cache.c:ie_modified() will go to the filesystem to see if the contents match. As mentioned in the function: * Immediately after read-tree or update-index --cacheinfo, * the length field is zero, as we have never even read the * lstat(2) information once, and we cannot trust DATA_CHANGED * returned by ie_match_stat() which in turn was returned by * ce_match_stat_basic() to signal that the filesize of the * blob changed. We have to actually go to the filesystem to * see if the contents match, and if so, should answer "unchanged". So, if we have N entries in a colliding group and we decide to write and lstat() only one of them, every subsequent git-status will have to read, convert, and hash the written file N - 1 times, to check that the N - 1 unwritten entries are dirty. By checking out all colliding entries (like the sequential code does), we only pay the overhead once. Co-authored-by: Nguyễn Thái Ngọc Duy Co-authored-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- Note: currently, we have to check leading directories again before writing each parallel-eligible entry, as explained in the respective code comment. But I plan to remove this extra work on part II, postponing the checkout of symlinks to *after* the parallel-eligible entries. 
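To illustrate the collision handling described in the commit message above, here is a minimal standalone sketch. It is not code from this series: every demo_* name is hypothetical, and the retry step is a simplification of what feeding the entry to checkout_entry() again does (remove whatever is at the path, then write it). The first pass opens with O_CREAT | O_EXCL and treats EEXIST/EISDIR as a probable collision; the second pass overwrites, so the last writer wins as in the sequential code.

```c
/* Hypothetical sketch; the demo_* names are not part of the patch. */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

enum demo_status { DEMO_WRITTEN, DEMO_COLLIDED, DEMO_FAILED };

/* Write all of buf to fd, retrying on short writes; 0 on success, -1 on error. */
static int demo_write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * First pass: refuse to touch an existing path. EEXIST or EISDIR is read
 * as "another queued entry already landed here", mirroring the errno check
 * in write_pc_item().
 */
static enum demo_status demo_write_exclusive(const char *path, const char *data)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	enum demo_status st = DEMO_WRITTEN;

	if (fd < 0)
		return (errno == EEXIST || errno == EISDIR) ?
			DEMO_COLLIDED : DEMO_FAILED;
	if (demo_write_all(fd, data, strlen(data)))
		st = DEMO_FAILED;
	if (close(fd))
		st = DEMO_FAILED;
	return st;
}

/*
 * Second pass, run sequentially for collided items only: remove whatever
 * is at the path and write again, so the later entry replaces the earlier.
 */
static enum demo_status demo_retry_overwrite(const char *path, const char *data)
{
	if (unlink(path) && errno != ENOENT)
		return DEMO_FAILED;
	return demo_write_exclusive(path, data);
}
```

Calling demo_write_exclusive() twice on the same path reports DEMO_WRITTEN and then DEMO_COLLIDED; demo_retry_overwrite() on the collided item then leaves the second entry's contents on disk.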
Makefile | 1 + entry.c | 17 +- parallel-checkout.c | 368 ++++++++++++++++++++++++++++++++++++++++++++ parallel-checkout.h | 27 ++++ unpack-trees.c | 6 +- 5 files changed, 416 insertions(+), 3 deletions(-) create mode 100644 parallel-checkout.c create mode 100644 parallel-checkout.h diff --git a/Makefile b/Makefile index f1b1bc8aa0..3edcdc534c 100644 --- a/Makefile +++ b/Makefile @@ -932,6 +932,7 @@ LIB_OBJS += pack-revindex.o LIB_OBJS += pack-write.o LIB_OBJS += packfile.o LIB_OBJS += pager.o +LIB_OBJS += parallel-checkout.o LIB_OBJS += parse-options-cb.o LIB_OBJS += parse-options.o LIB_OBJS += patch-delta.o diff --git a/entry.c b/entry.c index 9d79a5671f..6676954431 100644 --- a/entry.c +++ b/entry.c @@ -7,6 +7,7 @@ #include "progress.h" #include "fsmonitor.h" #include "entry.h" +#include "parallel-checkout.h" static void create_directories(const char *path, int path_len, const struct checkout *state) @@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state, for (i = 0; i < state->istate->cache_nr; i++) { struct cache_entry *dup = state->istate->cache[i]; - if (dup == ce) - break; + if (dup == ce) { + /* + * Parallel checkout creates the files in no particular + * order. So the other side of the collision may appear + * after the given cache_entry in the array. 
+ */ + if (parallel_checkout_status() == PC_RUNNING) + continue; + else + break; + } if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE)) continue; @@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca, ca = &ca_buf; } + if (!enqueue_checkout(ce, ca)) + return 0; + return write_entry(ce, path.buf, ca, state, 0); } diff --git a/parallel-checkout.c b/parallel-checkout.c new file mode 100644 index 0000000000..7dc8ab2a67 --- /dev/null +++ b/parallel-checkout.c @@ -0,0 +1,368 @@ +#include "cache.h" +#include "entry.h" +#include "parallel-checkout.h" +#include "streaming.h" + +enum pc_item_status { + PC_ITEM_PENDING = 0, + PC_ITEM_WRITTEN, + /* + * The entry could not be written because there was another file + * already present in its path or leading directories. Since + * checkout_entry_ca() removes such files from the working tree before + * enqueueing the entry for parallel checkout, it means that there was + * a path collision among the entries being written. + */ + PC_ITEM_COLLIDED, + PC_ITEM_FAILED, +}; + +struct parallel_checkout_item { + /* pointer to a istate->cache[] entry. Not owned by us. 
*/ + struct cache_entry *ce; + struct conv_attrs ca; + struct stat st; + enum pc_item_status status; +}; + +struct parallel_checkout { + enum pc_status status; + struct parallel_checkout_item *items; + size_t nr, alloc; +}; + +static struct parallel_checkout parallel_checkout = { 0 }; + +enum pc_status parallel_checkout_status(void) +{ + return parallel_checkout.status; +} + +void init_parallel_checkout(void) +{ + if (parallel_checkout.status != PC_UNINITIALIZED) + BUG("parallel checkout already initialized"); + + parallel_checkout.status = PC_ACCEPTING_ENTRIES; +} + +static void finish_parallel_checkout(void) +{ + if (parallel_checkout.status == PC_UNINITIALIZED) + BUG("cannot finish parallel checkout: not initialized yet"); + + free(parallel_checkout.items); + memset(&parallel_checkout, 0, sizeof(parallel_checkout)); +} + +static int is_eligible_for_parallel_checkout(const struct cache_entry *ce, + const struct conv_attrs *ca) +{ + enum conv_attrs_classification c; + + if (!S_ISREG(ce->ce_mode)) + return 0; + + c = classify_conv_attrs(ca); + switch (c) { + case CA_CLASS_INCORE: + return 1; + + case CA_CLASS_INCORE_FILTER: + /* + * It would be safe to allow concurrent instances of + * single-file smudge filters, like rot13, but we should not + * assume that all filters are parallel-process safe. So we + * don't allow this. + */ + return 0; + + case CA_CLASS_INCORE_PROCESS: + /* + * The parallel queue and the delayed queue are not compatible, + * so they must be kept completely separated. And we can't tell + * if a long-running process will delay its response without + * actually asking it to perform the filtering. Therefore, this + * type of filter is not allowed in parallel checkout. + * + * Furthermore, there should only be one instance of the + * long-running process filter as we don't know how it is + * managing its own concurrency.
So, spreading the entries that + * require such a filter among the parallel workers would + * require a lot more inter-process communication. We would + * probably have to designate a single process to interact with + * the filter and send all the necessary data to it, for each + * entry. + */ + return 0; + + case CA_CLASS_STREAMABLE: + return 1; + + default: + BUG("unsupported conv_attrs classification '%d'", c); + } +} + +int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca) +{ + struct parallel_checkout_item *pc_item; + + if (parallel_checkout.status != PC_ACCEPTING_ENTRIES || + !is_eligible_for_parallel_checkout(ce, ca)) + return -1; + + ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1, + parallel_checkout.alloc); + + pc_item = &parallel_checkout.items[parallel_checkout.nr++]; + pc_item->ce = ce; + memcpy(&pc_item->ca, ca, sizeof(pc_item->ca)); + pc_item->status = PC_ITEM_PENDING; + + return 0; +} + +static int handle_results(struct checkout *state) +{ + int ret = 0; + size_t i; + int have_pending = 0; + + /* + * We first update the successfully written entries with the collected + * stat() data, so that they can be found by mark_colliding_entries(), + * in the next loop, when necessary. + */ + for (i = 0; i < parallel_checkout.nr; ++i) { + struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; + if (pc_item->status == PC_ITEM_WRITTEN) + update_ce_after_write(state, pc_item->ce, &pc_item->st); + } + + for (i = 0; i < parallel_checkout.nr; ++i) { + struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; + + switch(pc_item->status) { + case PC_ITEM_WRITTEN: + /* Already handled */ + break; + case PC_ITEM_COLLIDED: + /* + * The entry could not be checked out due to a path + * collision with another entry. Since there can only + * be one entry of each colliding group on the disk, we + * could skip trying to check out this one and move on.
+ * However, this would leave the unwritten entries with + * null stat() fields on the index, which could + * potentially slow down subsequent operations that + * require refreshing it: git would not be able to + * trust st_size and would have to go to the filesystem + * to see if the contents match (see ie_modified()). + * + * Instead, let's pay the overhead only once, now, and + * call checkout_entry_ca() again for this file, to + * have its stat() data stored in the index. This also + * has the benefit of adding this entry and its + * colliding pair to the collision report message. + * Additionally, this overwriting behavior is consistent + * with what the sequential checkout does, so it doesn't + * add any extra overhead. + */ + ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca, + state, NULL, NULL); + break; + case PC_ITEM_PENDING: + have_pending = 1; + /* fall through */ + case PC_ITEM_FAILED: + ret = -1; + break; + default: + BUG("unknown checkout item status in parallel checkout"); + } + } + + if (have_pending) + error(_("parallel checkout finished with pending entries")); + + return ret; +} + +static int reset_fd(int fd, const char *path) +{ + if (lseek(fd, 0, SEEK_SET) != 0) + return error_errno("failed to rewind descriptor of %s", path); + if (ftruncate(fd, 0)) + return error_errno("failed to truncate file %s", path); + return 0; +} + +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd, + const char *path, struct checkout *state) +{ + int ret; + struct stream_filter *filter; + struct strbuf buf = STRBUF_INIT; + char *new_blob; + unsigned long size; + size_t newsize = 0; + ssize_t wrote; + + /* Sanity check */ + assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca)); + + filter = get_stream_filter_ca(&pc_item->ca, &pc_item->ce->oid); + if (filter) { + if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) { + /* On error, reset fd to try writing without streaming */ + if (reset_fd(fd, path)) + return -1; + }
else { + return 0; + } + } + + new_blob = read_blob_entry(pc_item->ce, &size); + if (!new_blob) + return error("unable to read sha1 file of %s (%s)", path, + oid_to_hex(&pc_item->ce->oid)); + + /* + * checkout metadata is used to give context for external process + * filters. Files requiring such filters are not eligible for parallel + * checkout, so pass NULL. + */ + ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name, + new_blob, size, &buf, NULL); + + if (ret) { + free(new_blob); + new_blob = strbuf_detach(&buf, &newsize); + size = newsize; + } + + wrote = write_in_full(fd, new_blob, size); + free(new_blob); + if (wrote < 0) + return error("unable to write file %s", path); + + return 0; +} + +static int close_and_clear(int *fd) +{ + int ret = 0; + + if (*fd >= 0) { + ret = close(*fd); + *fd = -1; + } + + return ret; +} + +static int check_leading_dirs(const char *path, int len, int prefix_len) +{ + const char *slash = path + len; + + while (slash > path && *slash != '/') + slash--; + + return has_dirs_only_path(path, slash - path, prefix_len); +} + +static void write_pc_item(struct parallel_checkout_item *pc_item, + struct checkout *state) +{ + unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666; + int fd = -1, fstat_done = 0; + struct strbuf path = STRBUF_INIT; + + strbuf_add(&path, state->base_dir, state->base_dir_len); + strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen); + + /* + * At this point, leading dirs should have already been created. But if + * a symlink being checked out has collided with one of the dirs, due to + * file system folding rules, it's possible that the dirs are no longer + * present. So we have to check again, and report any path collisions. 
+ */ + if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) { + pc_item->status = PC_ITEM_COLLIDED; + goto out; + } + + fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode); + + if (fd < 0) { + if (errno == EEXIST || errno == EISDIR) { + /* + * Errors which probably represent a path collision. + * Suppress the error message and mark the item to be + * retried later, sequentially. ENOTDIR and ENOENT are + * also interesting, but check_leading_dirs() should + * have already caught these cases. + */ + pc_item->status = PC_ITEM_COLLIDED; + } else { + error_errno("failed to open file %s", path.buf); + pc_item->status = PC_ITEM_FAILED; + } + goto out; + } + + if (write_pc_item_to_fd(pc_item, fd, path.buf, state)) { + /* Error was already reported. */ + pc_item->status = PC_ITEM_FAILED; + goto out; + } + + fstat_done = fstat_checkout_output(fd, state, &pc_item->st); + + if (close_and_clear(&fd)) { + error_errno("unable to close file %s", path.buf); + pc_item->status = PC_ITEM_FAILED; + goto out; + } + + if (state->refresh_cache && !fstat_done && lstat(path.buf, &pc_item->st) < 0) { + error_errno("unable to stat just-written file %s", path.buf); + pc_item->status = PC_ITEM_FAILED; + goto out; + } + + pc_item->status = PC_ITEM_WRITTEN; + +out: + /* + * No need to check close() return. At this point, either fd is already + * closed, or we are on an error path, that has already been reported. 
+ */ + close_and_clear(&fd); + strbuf_release(&path); +} + +static void write_items_sequentially(struct checkout *state) +{ + size_t i; + + for (i = 0; i < parallel_checkout.nr; ++i) + write_pc_item(&parallel_checkout.items[i], state); +} + +int run_parallel_checkout(struct checkout *state) +{ + int ret; + + if (parallel_checkout.status != PC_ACCEPTING_ENTRIES) + BUG("cannot run parallel checkout: uninitialized or already running"); + + parallel_checkout.status = PC_RUNNING; + + write_items_sequentially(state); + ret = handle_results(state); + + finish_parallel_checkout(); + return ret; +} diff --git a/parallel-checkout.h b/parallel-checkout.h new file mode 100644 index 0000000000..e6d6fc01ea --- /dev/null +++ b/parallel-checkout.h @@ -0,0 +1,27 @@ +#ifndef PARALLEL_CHECKOUT_H +#define PARALLEL_CHECKOUT_H + +struct cache_entry; +struct checkout; +struct conv_attrs; + +enum pc_status { + PC_UNINITIALIZED = 0, + PC_ACCEPTING_ENTRIES, + PC_RUNNING, +}; + +enum pc_status parallel_checkout_status(void); +void init_parallel_checkout(void); + +/* + * Return -1 if parallel checkout is currently not enabled or if the entry is + * not eligible for parallel checkout. Otherwise, enqueue the entry for later + * write and return 0.
+ */ +int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca); + +/* Write all the queued entries, returning 0 on success.*/ +int run_parallel_checkout(struct checkout *state); + +#endif /* PARALLEL_CHECKOUT_H */ diff --git a/unpack-trees.c b/unpack-trees.c index a511fadd89..1b1da7485a 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -17,6 +17,7 @@ #include "object-store.h" #include "promisor-remote.h" #include "entry.h" +#include "parallel-checkout.h" /* * Error messages expected by scripts out of plumbing commands such as @@ -438,7 +439,6 @@ static int check_updates(struct unpack_trees_options *o, if (should_update_submodules()) load_gitmodules_file(index, &state); - enable_delayed_checkout(&state); if (has_promisor_remote()) { /* * Prefetch the objects that are to be checked out in the loop @@ -461,6 +461,9 @@ static int check_updates(struct unpack_trees_options *o, to_fetch.oid, to_fetch.nr); oid_array_clear(&to_fetch); } + + enable_delayed_checkout(&state); + init_parallel_checkout(); for (i = 0; i < index->cache_nr; i++) { struct cache_entry *ce = index->cache[i]; @@ -474,6 +477,7 @@ static int check_updates(struct unpack_trees_options *o, } } stop_progress(&progress); + errs |= run_parallel_checkout(&state); errs |= finish_delayed_checkout(&state, NULL); git_attr_set_direction(GIT_ATTR_CHECKIN); From patchwork Tue Sep 22 22:49:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793519 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 078D1139A for ; Tue, 22 Sep 2020 22:51:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D62AB2076E for ; Tue, 22 Sep 2020 22:51:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) 
header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="iDnrjS8H" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726822AbgIVWvI (ORCPT ); Tue, 22 Sep 2020 18:51:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726794AbgIVWvH (ORCPT ); Tue, 22 Sep 2020 18:51:07 -0400 Received: from mail-qt1-x844.google.com (mail-qt1-x844.google.com [IPv6:2607:f8b0:4864:20::844]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97C59C061755 for ; Tue, 22 Sep 2020 15:51:07 -0700 (PDT) Received: by mail-qt1-x844.google.com with SMTP id r8so17063827qtp.13 for ; Tue, 22 Sep 2020 15:51:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=v9x8knwc97Xg0lxRO/pavgDemlEGmqtRL6lUSQs1NqQ=; b=iDnrjS8Hd4836xiH93PHBF7zla9Jn7Dsl5xdXOiOkKvnnTHuLwHDuSXVBepSe0Dlml JBGTbFSqebTzGImGt62R+UTWTa6YfRaRv+8aIvBknA2OeiuHe649B62mk0OVvv04BH2C 5P3CdvZxDafCQ18i6XaZoGA2Ao7akDF4TMXirkTe+mZ8tDbWCfsMUKN04e+fBa11xesS qMooHf7kiU0RWCI/+CFmA9O08bHoik/p+j5c0UhnsIqJAqpmsFip6K7K3GkdSh09cdEa jWHFtr42gD6JanwIIp+NWXbkp2V+QrGqiFCmR8rlc2rYWPMetQ5RHPZ4RsJS8MI7+cMx S9rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=v9x8knwc97Xg0lxRO/pavgDemlEGmqtRL6lUSQs1NqQ=; b=MVpS0A+HadIrxr12ISwiMndh1CodrSsDQ+zjVffYygTz3nczJKFNq2iuW6kcpGCszh YM7/gKA8kzjY7M3WuA1BKlOL5QNk/YxesLKnCk9a+rjrYNWbmrfA51WUEEETKlDSe6bl n5LzDh3WZIjlQc6kc8uh5+8Sa7F/3w+iEg5MSq6bbC53REd2NpYPubbOAKnKBP2U9nba nELyGp4ylnGt7N6QWx9gsTrOxYc5YZmSyaxYhvA+Fw2T8vhxuEKGLbthvMecJ0e2XSWz UBclSqADJ7JGDRXjmy1J6Krdre/MU7eu8XRBicyFi8DKpGLU54oXwM4hfRZJBwvVUzPr dOFg== X-Gm-Message-State: 
AOAM531W9lCQrYn2GE/Knjxrx5Med1Dr4PRLyLf5eZGfmgN+uNyGAwag VbqvYmbgxU4YV9xlBPRHXuuPMrRx+5aPqw== X-Google-Smtp-Source: ABdhPJyaIxPIcLWRJC0jqc5BDILdk5jCQdfFhgjf1SArOqRMhrf9c3G2T8TOcorUQMtZCN7o+80OKw== X-Received: by 2002:ac8:794c:: with SMTP id r12mr7206065qtt.162.1600815065791; Tue, 22 Sep 2020 15:51:05 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:05 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 11/19] parallel-checkout: make it truly parallel Date: Tue, 22 Sep 2020 19:49:25 -0300 Message-Id: <991169488b17e492c6d2c2f212267a66693aa7ec.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Use multiple worker processes to distribute the queued entries and call write_checkout_item() in parallel for them. The items are distributed uniformly in contiguous chunks. This minimizes the chances of two workers writing to the same directory simultaneously, which could affect performance due to lock contention in the kernel. Work stealing (or any other format of re-distribution) is not implemented yet. The parallel version was benchmarked during three operations in the linux repo, with cold cache: cloning v5.8, checking out v5.8 from v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). The four tables below show the mean run times and standard deviations for 5 runs in: a local file system with SSD, a local file system with HDD, a Linux NFS server, and Amazon EFS. The numbers of workers were chosen based on what produces the best result for each case. 
Local SSD:

            Clone                  Checkout I             Checkout II
Sequential  8.171 s ± 0.206 s      8.735 s ± 0.230 s      4.166 s ± 0.246 s
10 workers  3.277 s ± 0.138 s      3.774 s ± 0.188 s      2.561 s ± 0.120 s
Speedup     2.49 ± 0.12            2.31 ± 0.13            1.63 ± 0.12

Local HDD:

            Clone                  Checkout I             Checkout II
Sequential  35.157 s ± 0.205 s     48.835 s ± 0.407 s     47.302 s ± 1.435 s
8 workers   35.538 s ± 0.325 s     49.353 s ± 0.826 s     48.919 s ± 0.416 s
Speedup     0.99 ± 0.01            0.99 ± 0.02            0.97 ± 0.03

Linux NFS server (v4.1, on EBS, single availability zone):

            Clone                  Checkout I             Checkout II
Sequential  216.070 s ± 3.611 s    211.169 s ± 3.147 s    57.446 s ± 1.301 s
32 workers  67.997 s ± 0.740 s     66.563 s ± 0.457 s     23.708 s ± 0.622 s
Speedup     3.18 ± 0.06            3.17 ± 0.05            2.42 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

            Clone                  Checkout I             Checkout II
Sequential  1249.329 s ± 13.857 s  1438.979 s ± 78.792 s  543.919 s ± 18.745 s
64 workers  225.864 s ± 12.433 s   316.345 s ± 1.887 s    183.648 s ± 10.095 s
Speedup     5.53 ± 0.31            4.55 ± 0.25            2.96 ± 0.19

The above benchmarks show that parallel checkout is most effective on repositories located on an SSD or over a distributed file system. For local file systems on spinning disks, and/or older machines, the parallelism does not always bring a good performance. In fact, it can even increase the run time. For this reason, the sequential code is still the default. Two settings are added to optionally enable and configure the new parallel version as desired.

Local SSD tests were executed in an i7-7700HQ (4 cores with hyper-threading) running Manjaro Linux. Local HDD tests were executed in an i7-2600 (also 4 cores with hyper-threading), HDD Seagate Barracuda 7200 rpm SATA 3.0, running Debian 9.13. NFS and EFS tests were executed in an Amazon EC2 c5n.large instance, with 2 vCPUs. The Linux NFS server was running on a m6g.large instance with 1 TB, EBS GP2 volume. Before each timing, the linux repository was removed (or checked out back), and `sync && sysctl vm.drop_caches=3` was executed.
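For reference, the "distributed uniformly in contiguous chunks" scheme mentioned at the top of this message can be sketched as below. This is only an illustration under assumptions: struct demo_chunk and demo_split_contiguous() are hypothetical names, and handing the remainder to the first workers is one reasonable choice, not necessarily what the patch does. nr_workers is assumed to be greater than zero.

```c
/* Hypothetical sketch; the demo_* names are not part of the patch. */
#include <stddef.h>

struct demo_chunk {
	size_t start;	/* index of the first queued item for this worker */
	size_t nr;	/* number of consecutive items assigned to it */
};

/*
 * Split nr_items queued entries among nr_workers as evenly as possible,
 * keeping each worker's share contiguous in the queue. The first
 * (nr_items % nr_workers) workers receive one extra item. The out[] array
 * must have room for nr_workers chunks, and nr_workers must be > 0.
 */
static void demo_split_contiguous(size_t nr_items, size_t nr_workers,
				  struct demo_chunk *out)
{
	size_t base = nr_items / nr_workers;
	size_t extra = nr_items % nr_workers;
	size_t next = 0, i;

	for (i = 0; i < nr_workers; i++) {
		out[i].start = next;
		out[i].nr = base + (i < extra ? 1 : 0);
		next += out[i].nr;
	}
}
```

With 10 queued entries and 3 workers, this gives the chunks [0, 4), [4, 7) and [7, 10). Since the index is sorted by path, contiguous chunks tend to keep each worker inside its own set of directories, which is the stated motivation for this scheme.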
Co-authored-by: Nguyễn Thái Ngọc Duy Co-authored-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- .gitignore | 1 + Documentation/config/checkout.txt | 21 +++ Makefile | 1 + builtin.h | 1 + builtin/checkout--helper.c | 142 ++++++++++++++++ git.c | 2 + parallel-checkout.c | 273 +++++++++++++++++++++++++++--- parallel-checkout.h | 84 ++++++++- unpack-trees.c | 10 +- 9 files changed, 501 insertions(+), 34 deletions(-) create mode 100644 builtin/checkout--helper.c diff --git a/.gitignore b/.gitignore index d0f692a355..6427739814 100644 --- a/.gitignore +++ b/.gitignore @@ -33,6 +33,7 @@ /git-check-mailmap /git-check-ref-format /git-checkout +/git-checkout--helper /git-checkout-index /git-cherry /git-cherry-pick diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt index 6b646813ab..44eb58bcd3 100644 --- a/Documentation/config/checkout.txt +++ b/Documentation/config/checkout.txt @@ -16,3 +16,24 @@ will checkout the '' branch on another remote, and by linkgit:git-worktree[1] when 'git worktree add' refers to a remote branch. This setting might be used for other checkout-like commands or functionality in the future. + +checkout.workers:: + The number of parallel workers to use when updating the working tree. + The default is one, i.e. sequential execution. If set to a value less + than one, Git will use as many workers as the number of logical cores + available. This setting and checkout.thresholdForParallelism affect all + commands that perform checkout. E.g. checkout, switch, clone, reset, + sparse-checkout, read-tree, etc. ++ +Note: parallel checkout usually delivers better performance for repositories +located on SSDs or over NFS. For repositories on spinning disks and/or machines +with a small number of cores, the default sequential checkout often performs +better. The size and compression level of a repository might also influence how +well the parallel version performs. 
+ +checkout.thresholdForParallelism:: + When running parallel checkout with a small number of files, the cost + of subprocess spawning and inter-process communication might outweigh + the parallelization gains. This setting allows to define the minimum + number of files for which parallel checkout should be attempted. The + default is 100. diff --git a/Makefile b/Makefile index 3edcdc534c..e9c6616180 100644 --- a/Makefile +++ b/Makefile @@ -1049,6 +1049,7 @@ BUILTIN_OBJS += builtin/check-attr.o BUILTIN_OBJS += builtin/check-ignore.o BUILTIN_OBJS += builtin/check-mailmap.o BUILTIN_OBJS += builtin/check-ref-format.o +BUILTIN_OBJS += builtin/checkout--helper.o BUILTIN_OBJS += builtin/checkout-index.o BUILTIN_OBJS += builtin/checkout.o BUILTIN_OBJS += builtin/clean.o diff --git a/builtin.h b/builtin.h index ba954e180c..b52243848d 100644 --- a/builtin.h +++ b/builtin.h @@ -123,6 +123,7 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix); int cmd_bundle(int argc, const char **argv, const char *prefix); int cmd_cat_file(int argc, const char **argv, const char *prefix); int cmd_checkout(int argc, const char **argv, const char *prefix); +int cmd_checkout__helper(int argc, const char **argv, const char *prefix); int cmd_checkout_index(int argc, const char **argv, const char *prefix); int cmd_check_attr(int argc, const char **argv, const char *prefix); int cmd_check_ignore(int argc, const char **argv, const char *prefix); diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c new file mode 100644 index 0000000000..67fe37cf11 --- /dev/null +++ b/builtin/checkout--helper.c @@ -0,0 +1,142 @@ +#include "builtin.h" +#include "config.h" +#include "entry.h" +#include "parallel-checkout.h" +#include "parse-options.h" +#include "pkt-line.h" + +static void packet_to_pc_item(char *line, int len, + struct parallel_checkout_item *pc_item) +{ + struct pc_item_fixed_portion *fixed_portion; + char *encoding, *variant; + + if (len < sizeof(struct 
pc_item_fixed_portion)) + BUG("checkout worker received too short item (got %dB, exp %dB)", + len, (int)sizeof(struct pc_item_fixed_portion)); + + fixed_portion = (struct pc_item_fixed_portion *)line; + + if (len - sizeof(struct pc_item_fixed_portion) != + fixed_portion->name_len + fixed_portion->working_tree_encoding_len) + BUG("checkout worker received corrupted item"); + + variant = line + sizeof(struct pc_item_fixed_portion); + + /* + * Note: the main process uses zero length to communicate that the + * encoding is NULL. There is no use case in actually sending an empty + * string since it's considered as NULL when ca.working_tree_encoding + * is set at git_path_check_encoding(). + */ + if (fixed_portion->working_tree_encoding_len) { + encoding = xmemdupz(variant, + fixed_portion->working_tree_encoding_len); + variant += fixed_portion->working_tree_encoding_len; + } else { + encoding = NULL; + } + + memset(pc_item, 0, sizeof(*pc_item)); + pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len); + pc_item->ce->ce_namelen = fixed_portion->name_len; + pc_item->ce->ce_mode = fixed_portion->ce_mode; + memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen); + oidcpy(&pc_item->ce->oid, &fixed_portion->oid); + + pc_item->id = fixed_portion->id; + pc_item->ca.crlf_action = fixed_portion->crlf_action; + pc_item->ca.ident = fixed_portion->ident; + pc_item->ca.working_tree_encoding = encoding; +} + +static void report_result(struct parallel_checkout_item *pc_item) +{ + struct pc_item_result res = { 0 }; + size_t size; + + res.id = pc_item->id; + res.status = pc_item->status; + + if (pc_item->status == PC_ITEM_WRITTEN) { + res.st = pc_item->st; + size = sizeof(res); + } else { + size = PC_ITEM_RESULT_BASE_SIZE; + } + + packet_write(1, (const char *)&res, size); +} + +/* Free the worker-side malloced data, but not pc_item itself. 
*/ +static void release_pc_item_data(struct parallel_checkout_item *pc_item) +{ + free((char *)pc_item->ca.working_tree_encoding); + discard_cache_entry(pc_item->ce); +} + +static void worker_loop(struct checkout *state) +{ + struct parallel_checkout_item *items = NULL; + size_t i, nr = 0, alloc = 0; + + while (1) { + int len; + char *line = packet_read_line(0, &len); + + if (!line) + break; + + ALLOC_GROW(items, nr + 1, alloc); + packet_to_pc_item(line, len, &items[nr++]); + } + + for (i = 0; i < nr; ++i) { + struct parallel_checkout_item *pc_item = &items[i]; + write_pc_item(pc_item, state); + report_result(pc_item); + release_pc_item_data(pc_item); + } + + packet_flush(1); + + free(items); +} + +static const char * const checkout_helper_usage[] = { + N_("git checkout--helper []"), + NULL +}; + +int cmd_checkout__helper(int argc, const char **argv, const char *prefix) +{ + struct checkout state = CHECKOUT_INIT; + struct option checkout_helper_options[] = { + OPT_STRING(0, "prefix", &state.base_dir, N_("string"), + N_("when creating files, prepend ")), + OPT_END() + }; + + if (argc == 2 && !strcmp(argv[1], "-h")) + usage_with_options(checkout_helper_usage, + checkout_helper_options); + + git_config(git_default_config, NULL); + argc = parse_options(argc, argv, prefix, checkout_helper_options, + checkout_helper_usage, 0); + if (argc > 0) + usage_with_options(checkout_helper_usage, checkout_helper_options); + + if (state.base_dir) + state.base_dir_len = strlen(state.base_dir); + + /* + * Setting this on worker won't actually update the index. We just need + * to pretend so to induce the checkout machinery to stat() the written + * entries. 
+ */ + state.refresh_cache = 1; + + worker_loop(&state); + return 0; +} diff --git a/git.c b/git.c index 01c456edce..a09357fc56 100644 --- a/git.c +++ b/git.c @@ -487,6 +487,8 @@ static struct cmd_struct commands[] = { { "check-mailmap", cmd_check_mailmap, RUN_SETUP }, { "check-ref-format", cmd_check_ref_format, NO_PARSEOPT }, { "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE }, + { "checkout--helper", cmd_checkout__helper, + RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX }, { "checkout-index", cmd_checkout_index, RUN_SETUP | NEED_WORK_TREE}, { "cherry", cmd_cherry, RUN_SETUP }, diff --git a/parallel-checkout.c b/parallel-checkout.c index 7dc8ab2a67..7ea0faa526 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -1,28 +1,15 @@ #include "cache.h" #include "entry.h" #include "parallel-checkout.h" +#include "pkt-line.h" +#include "run-command.h" #include "streaming.h" +#include "thread-utils.h" +#include "config.h" -enum pc_item_status { - PC_ITEM_PENDING = 0, - PC_ITEM_WRITTEN, - /* - * The entry could not be written because there was another file - * already present in its path or leading directories. Since - * checkout_entry_ca() removes such files from the working tree before - * enqueueing the entry for parallel checkout, it means that there was - * a path collision among the entries being written. - */ - PC_ITEM_COLLIDED, - PC_ITEM_FAILED, -}; - -struct parallel_checkout_item { - /* pointer to a istate->cache[] entry. Not owned by us. 
*/ - struct cache_entry *ce; - struct conv_attrs ca; - struct stat st; - enum pc_item_status status; +struct pc_worker { + struct child_process cp; + size_t next_to_complete, nr_to_complete; }; struct parallel_checkout { @@ -38,6 +25,19 @@ enum pc_status parallel_checkout_status(void) return parallel_checkout.status; } +#define DEFAULT_THRESHOLD_FOR_PARALLELISM 100 + +void get_parallel_checkout_configs(int *num_workers, int *threshold) +{ + if (git_config_get_int("checkout.workers", num_workers)) + *num_workers = 1; + else if (*num_workers < 1) + *num_workers = online_cpus(); + + if (git_config_get_int("checkout.thresholdForParallelism", threshold)) + *threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM; +} + void init_parallel_checkout(void) { if (parallel_checkout.status != PC_UNINITIALIZED) @@ -115,10 +115,12 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca) ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1, parallel_checkout.alloc); - pc_item = &parallel_checkout.items[parallel_checkout.nr++]; + pc_item = &parallel_checkout.items[parallel_checkout.nr]; pc_item->ce = ce; memcpy(&pc_item->ca, ca, sizeof(pc_item->ca)); pc_item->status = PC_ITEM_PENDING; + pc_item->id = parallel_checkout.nr; + parallel_checkout.nr++; return 0; } @@ -231,7 +233,8 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd, /* * checkout metadata is used to give context for external process * filters. Files requiring such filters are not eligible for parallel - * checkout, so pass NULL. + * checkout, so pass NULL. Note: if that changes, the metadata must also + * be passed from the main process to the workers.
*/ ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name, new_blob, size, &buf, NULL); @@ -272,8 +275,8 @@ static int check_leading_dirs(const char *path, int len, int prefix_len) return has_dirs_only_path(path, slash - path, prefix_len); } -static void write_pc_item(struct parallel_checkout_item *pc_item, - struct checkout *state) +void write_pc_item(struct parallel_checkout_item *pc_item, + struct checkout *state) { unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666; int fd = -1, fstat_done = 0; @@ -343,6 +346,214 @@ static void write_pc_item(struct parallel_checkout_item *pc_item, strbuf_release(&path); } +static void send_one_item(int fd, struct parallel_checkout_item *pc_item) +{ + size_t len_data; + char *data, *variant; + struct pc_item_fixed_portion *fixed_portion; + const char *working_tree_encoding = pc_item->ca.working_tree_encoding; + size_t name_len = pc_item->ce->ce_namelen; + size_t working_tree_encoding_len = working_tree_encoding ? + strlen(working_tree_encoding) : 0; + + len_data = sizeof(struct pc_item_fixed_portion) + name_len + + working_tree_encoding_len; + + data = xcalloc(1, len_data); + + fixed_portion = (struct pc_item_fixed_portion *)data; + fixed_portion->id = pc_item->id; + oidcpy(&fixed_portion->oid, &pc_item->ce->oid); + fixed_portion->ce_mode = pc_item->ce->ce_mode; + fixed_portion->crlf_action = pc_item->ca.crlf_action; + fixed_portion->ident = pc_item->ca.ident; + fixed_portion->name_len = name_len; + fixed_portion->working_tree_encoding_len = working_tree_encoding_len; + + variant = data + sizeof(*fixed_portion); + if (working_tree_encoding_len) { + memcpy(variant, working_tree_encoding, working_tree_encoding_len); + variant += working_tree_encoding_len; + } + memcpy(variant, pc_item->ce->name, name_len); + + packet_write(fd, data, len_data); + + free(data); +} + +static void send_batch(int fd, size_t start, size_t nr) +{ + size_t i; + for (i = 0; i < nr; ++i) + send_one_item(fd, &parallel_checkout.items[start
+ i]); + packet_flush(fd); +} + +static struct pc_worker *setup_workers(struct checkout *state, int num_workers) +{ + struct pc_worker *workers; + int i, workers_with_one_extra_item; + size_t base_batch_size, next_to_assign = 0; + + ALLOC_ARRAY(workers, num_workers); + + for (i = 0; i < num_workers; ++i) { + struct child_process *cp = &workers[i].cp; + + child_process_init(cp); + cp->git_cmd = 1; + cp->in = -1; + cp->out = -1; + cp->clean_on_exit = 1; + strvec_push(&cp->args, "checkout--helper"); + if (state->base_dir_len) + strvec_pushf(&cp->args, "--prefix=%s", state->base_dir); + if (start_command(cp)) + die(_("failed to spawn checkout worker")); + } + + base_batch_size = parallel_checkout.nr / num_workers; + workers_with_one_extra_item = parallel_checkout.nr % num_workers; + + for (i = 0; i < num_workers; ++i) { + struct pc_worker *worker = &workers[i]; + size_t batch_size = base_batch_size; + + /* distribute the extra work evenly */ + if (i < workers_with_one_extra_item) + batch_size++; + + send_batch(worker->cp.in, next_to_assign, batch_size); + worker->next_to_complete = next_to_assign; + worker->nr_to_complete = batch_size; + + next_to_assign += batch_size; + } + + return workers; +} + +static void finish_workers(struct pc_worker *workers, int num_workers) +{ + int i; + + /* + * Close pipes before calling finish_command() to let the workers + * exit asynchronously and avoid spending extra time on wait(). 
+ */ + for (i = 0; i < num_workers; ++i) { + struct child_process *cp = &workers[i].cp; + if (cp->in >= 0) + close(cp->in); + if (cp->out >= 0) + close(cp->out); + } + + for (i = 0; i < num_workers; ++i) { + if (finish_command(&workers[i].cp)) + error(_("checkout worker %d finished with error"), i); + } + + free(workers); +} + +#define ASSERT_PC_ITEM_RESULT_SIZE(got, exp) \ +do { \ + if (got != exp) \ + BUG("corrupted result from checkout worker (got %dB, exp %dB)", \ + got, exp); \ +} while(0) + +static void parse_and_save_result(const char *line, int len, + struct pc_worker *worker) +{ + struct pc_item_result *res; + struct parallel_checkout_item *pc_item; + struct stat *st = NULL; + + if (len < PC_ITEM_RESULT_BASE_SIZE) + BUG("too short result from checkout worker (got %dB, exp %dB)", + len, (int)PC_ITEM_RESULT_BASE_SIZE); + + res = (struct pc_item_result *)line; + + /* + * Worker should send either the full result struct on success, or + * just the base (i.e. no stat data), otherwise. + */ + if (res->status == PC_ITEM_WRITTEN) { + ASSERT_PC_ITEM_RESULT_SIZE(len, (int)sizeof(struct pc_item_result)); + st = &res->st; + } else { + ASSERT_PC_ITEM_RESULT_SIZE(len, (int)PC_ITEM_RESULT_BASE_SIZE); + } + + if (!worker->nr_to_complete || res->id != worker->next_to_complete) + BUG("checkout worker sent unexpected item id"); + + worker->next_to_complete++; + worker->nr_to_complete--; + + pc_item = &parallel_checkout.items[res->id]; + pc_item->status = res->status; + if (st) + pc_item->st = *st; +} + + +static void gather_results_from_workers(struct pc_worker *workers, + int num_workers) +{ + int i, active_workers = num_workers; + struct pollfd *pfds; + + CALLOC_ARRAY(pfds, num_workers); + for (i = 0; i < num_workers; ++i) { + pfds[i].fd = workers[i].cp.out; + pfds[i].events = POLLIN; + } + + while (active_workers) { + int nr = poll(pfds, num_workers, -1); + + if (nr < 0) { + if (errno == EINTR) + continue; + die_errno("failed to poll checkout workers"); + } + + for (i = 0; i <
num_workers && nr > 0; ++i) { + struct pc_worker *worker = &workers[i]; + struct pollfd *pfd = &pfds[i]; + + if (!pfd->revents) + continue; + + if (pfd->revents & POLLIN) { + int len; + const char *line = packet_read_line(pfd->fd, &len); + + if (!line) { + pfd->fd = -1; + active_workers--; + } else { + parse_and_save_result(line, len, worker); + } + } else if (pfd->revents & POLLHUP) { + pfd->fd = -1; + active_workers--; + } else if (pfd->revents & (POLLNVAL | POLLERR)) { + die(_("error polling from checkout worker")); + } + + nr--; + } + } + + free(pfds); +} + static void write_items_sequentially(struct checkout *state) { size_t i; @@ -351,7 +562,7 @@ static void write_items_sequentially(struct checkout *state) write_pc_item(&parallel_checkout.items[i], state); } -int run_parallel_checkout(struct checkout *state) +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold) { int ret; @@ -360,7 +571,17 @@ int run_parallel_checkout(struct checkout *state) parallel_checkout.status = PC_RUNNING; - write_items_sequentially(state); + if (parallel_checkout.nr < num_workers) + num_workers = parallel_checkout.nr; + + if (num_workers <= 1 || parallel_checkout.nr < threshold) { + write_items_sequentially(state); + } else { + struct pc_worker *workers = setup_workers(state, num_workers); + gather_results_from_workers(workers, num_workers); + finish_workers(workers, num_workers); + } + ret = handle_results(state); finish_parallel_checkout(); diff --git a/parallel-checkout.h b/parallel-checkout.h index e6d6fc01ea..0c9984584e 100644 --- a/parallel-checkout.h +++ b/parallel-checkout.h @@ -1,9 +1,12 @@ #ifndef PARALLEL_CHECKOUT_H #define PARALLEL_CHECKOUT_H -struct cache_entry; -struct checkout; -struct conv_attrs; +#include "entry.h" +#include "convert.h" + +/**************************************************************** + * Users of parallel checkout + ****************************************************************/ enum pc_status { PC_UNINITIALIZED = 0, @@
-12,6 +15,7 @@ enum pc_status { }; enum pc_status parallel_checkout_status(void); +void get_parallel_checkout_configs(int *num_workers, int *threshold); void init_parallel_checkout(void); /* @@ -21,7 +25,77 @@ void init_parallel_checkout(void); */ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca); -/* Write all the queued entries, returning 0 on success.*/ -int run_parallel_checkout(struct checkout *state); +/* + * Write all the queued entries, returning 0 on success. If the number of + * entries is smaller than the specified threshold, the operation is performed + * sequentially. + */ +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold); + +/**************************************************************** + * Interface with checkout--helper + ****************************************************************/ + +enum pc_item_status { + PC_ITEM_PENDING = 0, + PC_ITEM_WRITTEN, + /* + * The entry could not be written because there was another file + * already present in its path or leading directories. Since + * checkout_entry_ca() removes such files from the working tree before + * enqueueing the entry for parallel checkout, it means that there was + * a path collision among the entries being written. + */ + PC_ITEM_COLLIDED, + PC_ITEM_FAILED, +}; + +struct parallel_checkout_item { + /* + * In main process ce points to a istate->cache[] entry. Thus, it's not + * owned by us. In workers they own the memory, which *must be* released. + */ + struct cache_entry *ce; + struct conv_attrs ca; + size_t id; /* position in parallel_checkout.items[] of main process */ + + /* Output fields, sent from workers. */ + enum pc_item_status status; + struct stat st; +}; + +/* + * The fixed-size portion of `struct parallel_checkout_item` that is sent to the + * workers. Following this will be 2 strings: ca.working_tree_encoding and + * ce.name; These are NOT null terminated, since we have the size in the fixed + * portion. 
+ * + * Note that not all fields of conv_attrs and cache_entry are passed, only the + * ones that will be required by the workers to smudge and write the entry. + */ +struct pc_item_fixed_portion { + size_t id; + struct object_id oid; + unsigned int ce_mode; + enum crlf_action crlf_action; + int ident; + size_t working_tree_encoding_len; + size_t name_len; +}; + +/* + * The fields of `struct parallel_checkout_item` that are returned by the + * workers. Note: `st` must be the last one, as it is omitted on error. + */ +struct pc_item_result { + size_t id; + enum pc_item_status status; + struct stat st; +}; + +#define PC_ITEM_RESULT_BASE_SIZE offsetof(struct pc_item_result, st) + +void write_pc_item(struct parallel_checkout_item *pc_item, + struct checkout *state); #endif /* PARALLEL_CHECKOUT_H */ diff --git a/unpack-trees.c b/unpack-trees.c index 1b1da7485a..117ed42370 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -399,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o, int errs = 0; struct progress *progress; struct checkout state = CHECKOUT_INIT; - int i; + int i, pc_workers, pc_threshold; trace_performance_enter(); state.force = 1; @@ -462,8 +462,11 @@ static int check_updates(struct unpack_trees_options *o, oid_array_clear(&to_fetch); } + get_parallel_checkout_configs(&pc_workers, &pc_threshold); + enable_delayed_checkout(&state); - init_parallel_checkout(); + if (pc_workers > 1) + init_parallel_checkout(); for (i = 0; i < index->cache_nr; i++) { struct cache_entry *ce = index->cache[i]; @@ -477,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o, } } stop_progress(&progress); - errs |= run_parallel_checkout(&state); + if (pc_workers > 1) + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold); errs |= finish_delayed_checkout(&state, NULL); git_attr_set_direction(GIT_ATTR_CHECKIN); From patchwork Tue Sep 22 22:49:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 
X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793521 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5B449139A for ; Tue, 22 Sep 2020 22:51:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 416E82076E for ; Tue, 22 Sep 2020 22:51:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="fILpvfYZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726829AbgIVWvK (ORCPT ); Tue, 22 Sep 2020 18:51:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726617AbgIVWvK (ORCPT ); Tue, 22 Sep 2020 18:51:10 -0400 Received: from mail-qt1-x842.google.com (mail-qt1-x842.google.com [IPv6:2607:f8b0:4864:20::842]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06743C061755 for ; Tue, 22 Sep 2020 15:51:10 -0700 (PDT) Received: by mail-qt1-x842.google.com with SMTP id b2so17052491qtp.8 for ; Tue, 22 Sep 2020 15:51:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oG2TFWae6qgA8+hB1Rk7fweaT00Ae9ltZ2welbLce+Q=; b=fILpvfYZ0rB+Oe/9wQrYEiR1QUi+SA87YiH5OVafsnqFczl4Y2mjvWX7ltWpPIJ7X6 EhHgqkrsb+kyUuzZWLmo1IVcOOuIyWOrYazzwMusD7Nt2vinOYL0NIZC8HkX2Z8cjKvy X9iFP2soDc7jojWGGW14g9/2cRGi85Jv8ByhhPEUAtYOJNNG+9ei6l1L0KJTklRa/Ypr itvrRnEZ+YfMWsNgiyDt3/FWmTEgVoot6MtUfkAw9YTYMf+xZih0NiGzGr1Zc/nw7+m5 H66IuodLFmff2TIJfYkfKQToGHwEwbAtKJM4hZhgOs/aURQhXDD9whXg3LhriwqmKUua je3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oG2TFWae6qgA8+hB1Rk7fweaT00Ae9ltZ2welbLce+Q=; b=SAg5sd+sZiXfcVYnVd+aukE34CrYWlWAs7+1W8pkA7sQnrlhcwKqzSc/YZksKeRuuk gifE7MD7LS8BnaXNlfZRXuL7qlRmXrxl+HL3hSHmoEzuFw7iz/TOdjE6OGTl3TmXaf/A xoWF3Z2a0wof4BZ/OMccsPev200PpNWmTJUA6K3it6kOT9R5LcOGuPr4r5lP64I5C1Up GVGDJWacAUhs4HYjxiBScnzWzci9RDohquS1vf54qMUdweX1Jprv2YXV4lKGhX0bWk9G 3XrsAf4BQV6j8DhofUF5IxGm1bWYs2A2uR3iOQcozEQh+ELYieWTOnzVJFXAFALAfK19 hLxg== X-Gm-Message-State: AOAM5327enTVYRvC6W/opwuIKeFVoxMyAhoKFmJ5ZXwz4FYAY7NRmY73 Ryb4p0P5Lbef/EndHdFstVZyuphIlgPgcw== X-Google-Smtp-Source: ABdhPJySvvQxj0zMQ87zNd+d4DELKYFEyOI1g0ObceXSKTPcernjEceh5mnFlBMlNmUA5wzTHdDHGw== X-Received: by 2002:ac8:5d0d:: with SMTP id f13mr7143150qtx.87.1600815068783; Tue, 22 Sep 2020 15:51:08 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:07 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 12/19] parallel-checkout: support progress displaying Date: Tue, 22 Sep 2020 19:49:26 -0300 Message-Id: <7ceadf2427b5c9b04c0943c0b257b8ebcaa13f19.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Original-patch-by: Nguyễn Thái Ngọc Duy Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Matheus Tavares --- parallel-checkout.c | 34 +++++++++++++++++++++++++++++++--- parallel-checkout.h | 4 +++- unpack-trees.c | 11 ++++++++--- 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/parallel-checkout.c b/parallel-checkout.c index 7ea0faa526..5156b14c53 100644 --- a/parallel-checkout.c +++ 
b/parallel-checkout.c @@ -2,6 +2,7 @@ #include "entry.h" #include "parallel-checkout.h" #include "pkt-line.h" +#include "progress.h" #include "run-command.h" #include "streaming.h" #include "thread-utils.h" @@ -16,6 +17,8 @@ struct parallel_checkout { enum pc_status status; struct parallel_checkout_item *items; size_t nr, alloc; + struct progress *progress; + unsigned int *progress_cnt; }; static struct parallel_checkout parallel_checkout = { 0 }; @@ -125,6 +128,20 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca) return 0; } +size_t pc_queue_size(void) +{ + return parallel_checkout.nr; +} + +static void advance_progress_meter(void) +{ + if (parallel_checkout.progress) { + (*parallel_checkout.progress_cnt)++; + display_progress(parallel_checkout.progress, + *parallel_checkout.progress_cnt); + } +} + static int handle_results(struct checkout *state) { int ret = 0; @@ -173,6 +190,7 @@ static int handle_results(struct checkout *state) */ ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca, state, NULL, NULL); + advance_progress_meter(); break; case PC_ITEM_PENDING: have_pending = 1; @@ -499,6 +517,9 @@ static void parse_and_save_result(const char *line, int len, pc_item->status = res->status; if (st) pc_item->st = *st; + + if (res->status != PC_ITEM_COLLIDED) + advance_progress_meter(); } @@ -558,11 +579,16 @@ static void write_items_sequentially(struct checkout *state) { size_t i; - for (i = 0; i < parallel_checkout.nr; ++i) - write_pc_item(&parallel_checkout.items[i], state); + for (i = 0; i < parallel_checkout.nr; ++i) { + struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; + write_pc_item(pc_item, state); + if (pc_item->status != PC_ITEM_COLLIDED) + advance_progress_meter(); + } } -int run_parallel_checkout(struct checkout *state, int num_workers, int threshold) +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold, + struct progress *progress, unsigned int *progress_cnt) { int ret; @@ -570,6 +596,8 @@ int
run_parallel_checkout(struct checkout *state, int num_workers, int threshold BUG("cannot run parallel checkout: uninitialized or already running"); parallel_checkout.status = PC_RUNNING; + parallel_checkout.progress = progress; + parallel_checkout.progress_cnt = progress_cnt; if (parallel_checkout.nr < num_workers) num_workers = parallel_checkout.nr; diff --git a/parallel-checkout.h b/parallel-checkout.h index 0c9984584e..6c3a016c0b 100644 --- a/parallel-checkout.h +++ b/parallel-checkout.h @@ -24,13 +24,15 @@ void init_parallel_checkout(void); * write and return 0. */ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca); +size_t pc_queue_size(void); /* * Write all the queued entries, returning 0 on success. If the number of * entries is smaller than the specified threshold, the operation is performed * sequentially. */ -int run_parallel_checkout(struct checkout *state, int num_workers, int threshold); +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold, + struct progress *progress, unsigned int *progress_cnt); /**************************************************************** * Interface with checkout--helper diff --git a/unpack-trees.c b/unpack-trees.c index 117ed42370..e05e6ceff2 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -471,17 +471,22 @@ static int check_updates(struct unpack_trees_options *o, struct cache_entry *ce = index->cache[i]; if (ce->ce_flags & CE_UPDATE) { + size_t last_pc_queue_size = pc_queue_size(); + if (ce->ce_flags & CE_WT_REMOVE) BUG("both update and delete flags are set on %s", ce->name); - display_progress(progress, ++cnt); ce->ce_flags &= ~CE_UPDATE; errs |= checkout_entry(ce, &state, NULL, NULL); + + if (last_pc_queue_size == pc_queue_size()) + display_progress(progress, ++cnt); } } - stop_progress(&progress); if (pc_workers > 1) - errs |= run_parallel_checkout(&state, pc_workers, pc_threshold); + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, + progress, &cnt); + 
stop_progress(&progress); errs |= finish_delayed_checkout(&state, NULL); git_attr_set_direction(GIT_ATTR_CHECKIN); From patchwork Tue Sep 22 22:49:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793523 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3A17C59D for ; Tue, 22 Sep 2020 22:51:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1A22E20BED for ; Tue, 22 Sep 2020 22:51:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="jytwaKku" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726831AbgIVWvN (ORCPT ); Tue, 22 Sep 2020 18:51:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726722AbgIVWvM (ORCPT ); Tue, 22 Sep 2020 18:51:12 -0400 Received: from mail-qt1-x841.google.com (mail-qt1-x841.google.com [IPv6:2607:f8b0:4864:20::841]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DC52BC061755 for ; Tue, 22 Sep 2020 15:51:12 -0700 (PDT) Received: by mail-qt1-x841.google.com with SMTP id b2so17052579qtp.8 for ; Tue, 22 Sep 2020 15:51:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WQQfrIVtMpimuOF0RigPUVD9jmZYRZ7hqXsqIZy33Y4=; b=jytwaKkuM839n+tja9nItU/AiieWgSyF+0FlRjwuTlebpNblxxlhex65Nq2vQ3KeqH r9FV4jiuoCjKSd0/7oZcqFjFzJSAVhWgrMnOjJvWva8KTxff2yZ/XQMBzoGedLi1ZrM7 nYUbscqf/n5xziBmy43+W/Imc1RerS+iWZ+XagBrzhXY3SObQJVyz0atQCkJgea/EyH9 
z4J4ZQg8vkPC0+AH7g2UTfFkXkW6n8Q+aHkOLBeCUZ0NliaDMsHNMbcqFICQtcU+IPcC vwnyuCRHwuH6YyhfpI+887khT/uDAy9wfYme8VoSvl7K1lNMMQpbIM8D71OUyVl9sS04 2Ymg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WQQfrIVtMpimuOF0RigPUVD9jmZYRZ7hqXsqIZy33Y4=; b=RGPuH1NoHtaYXbY0XfWQgfBkkva8QPbu7kL7HJscvvQoYx+YEkbcZlfv36mjl8IpIp 20+RZbL+5TeXhnFRhzDQQIqvQH5Z2SsyuDo5b+85poNZX8hUqq/hMZXYXCMjV6Zx2gU+ GBf2ndKScn/ixwH0fAW8XAsJsJcUuli+kAbXt4U1fqxi6Yeta/3bhotTrANKpFtcfW11 KtVILjwKlGo57haCJUDInHuIvjVfnag03szdyD90n3r2hUgzdVexfUpOjtXt3cg6MJJ4 RE3ZAHlcFgWmzURRJgZ87JtwxjSc7XYWaWl4qU8lUqkQ5kcmjzfSUqIoMYsqeIJu+Ssl dliQ== X-Gm-Message-State: AOAM531hS8wu7Ug7sa8sDCxi12WtInKmU4uUvucDhHzsGvyQ3qCGRQ03 anPeqM/rs4DnsrHim2TR7BrdoFzpybQ+wQ== X-Google-Smtp-Source: ABdhPJyo8zsBMvrcexlHunj8xU5ZKO3MOkDf9lOG6CJ2Dq5WXlSxp5SOO1pvm09hOWMFbmTmU2Z9bg== X-Received: by 2002:ac8:f23:: with SMTP id e32mr7181735qtk.168.1600815071650; Tue, 22 Sep 2020 15:51:11 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:10 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Date: Tue, 22 Sep 2020 19:49:27 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Allow make_transient_cache_entry() to optionally receive a mem_pool struct in which it should allocate the entry. This will be used in the following patch, to store some transient entries which should persist until parallel checkout finishes. 
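For illustration only, the "allocate in the pool if one is given, otherwise on the heap" pattern described above can be sketched in isolation. This is not git code: `toy_pool`, `toy_entry`, `toy_pool_calloc()` and `make_toy_entry()` are hypothetical stand-ins for `struct mem_pool`, `struct cache_entry` and the real allocation helpers, and the bump allocator is deliberately simplistic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a memory pool: a fixed buffer, bump-allocated. */
struct toy_pool {
	char *buf;
	size_t used, cap;
};

/* Hypothetical stand-in for a cache entry with a trailing name. */
struct toy_entry {
	size_t namelen;
	char name[]; /* flexible array member, NUL-terminated */
};

/* Return zeroed memory carved out of the pool, or NULL if it is full. */
static void *toy_pool_calloc(struct toy_pool *mp, size_t size)
{
	void *p;

	assert(mp && mp->buf);
	/* Round up so that every allocation stays pointer-aligned. */
	size = (size + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
	if (mp->used + size > mp->cap)
		return NULL;
	p = mp->buf + mp->used;
	memset(p, 0, size);
	mp->used += size;
	return p;
}

/*
 * Create an entry for `path`. With a non-NULL pool the entry lives in the
 * pool and is released together with it; with NULL it is heap-allocated
 * and the caller must free() it.
 */
static struct toy_entry *make_toy_entry(const char *path, struct toy_pool *mp)
{
	size_t len = strlen(path);
	size_t size = sizeof(struct toy_entry) + len + 1;
	struct toy_entry *e = mp ? toy_pool_calloc(mp, size) : calloc(1, size);

	if (!e)
		return NULL;
	e->namelen = len;
	memcpy(e->name, path, len);
	return e;
}
```

In this sketch, a caller passing NULL keeps the old behavior and frees each entry individually, while a caller passing a pool can keep all entries alive until a later point and release them at once by discarding the pool.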
Signed-off-by: Matheus Tavares --- builtin/checkout--helper.c | 2 +- builtin/checkout.c | 2 +- builtin/difftool.c | 2 +- cache.h | 10 +++++----- read-cache.c | 12 ++++++++---- unpack-trees.c | 2 +- 6 files changed, 17 insertions(+), 13 deletions(-) diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c index 67fe37cf11..9646ed9eeb 100644 --- a/builtin/checkout--helper.c +++ b/builtin/checkout--helper.c @@ -38,7 +38,7 @@ static void packet_to_pc_item(char *line, int len, } memset(pc_item, 0, sizeof(*pc_item)); - pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len); + pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len, NULL); pc_item->ce->ce_namelen = fixed_portion->name_len; pc_item->ce->ce_mode = fixed_portion->ce_mode; memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen); diff --git a/builtin/checkout.c b/builtin/checkout.c index b18b9d6f3c..c0bf5e6711 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -291,7 +291,7 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid)) die(_("Unable to add merge result for '%s'"), path); free(result_buf.ptr); - ce = make_transient_cache_entry(mode, &oid, path, 2); + ce = make_transient_cache_entry(mode, &oid, path, 2, NULL); if (!ce) die(_("make_cache_entry failed for path '%s'"), path); status = checkout_entry(ce, state, NULL, nr_checkouts); diff --git a/builtin/difftool.c b/builtin/difftool.c index dfa22b67eb..5e7a57c8c2 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -323,7 +323,7 @@ static int checkout_path(unsigned mode, struct object_id *oid, struct cache_entry *ce; int ret; - ce = make_transient_cache_entry(mode, oid, path, 0); + ce = make_transient_cache_entry(mode, oid, path, 0, NULL); ret = checkout_entry(ce, state, NULL, NULL); discard_cache_entry(ce); diff --git a/cache.h b/cache.h index 17350cafa2..a394263f0e 100644 --- a/cache.h +++ 
b/cache.h @@ -355,16 +355,16 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate, size_t name_len); /* - * Create a cache_entry that is not intended to be added to an index. - * Caller is responsible for discarding the cache_entry - * with `discard_cache_entry`. + * Create a cache_entry that is not intended to be added to an index. If mp is + * not NULL, the entry is allocated within the given memory pool. Caller is + * responsible for discarding the cache_entry with `discard_cache_entry`. */ struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct object_id *oid, const char *path, - int stage); + int stage, struct mem_pool *mp); -struct cache_entry *make_empty_transient_cache_entry(size_t name_len); +struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp); /* * Discard cache entry. diff --git a/read-cache.c b/read-cache.c index ecf6f68994..f9bac760af 100644 --- a/read-cache.c +++ b/read-cache.c @@ -813,8 +813,10 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate, size_t le return mem_pool__ce_calloc(find_mem_pool(istate), len); } -struct cache_entry *make_empty_transient_cache_entry(size_t len) +struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp) { + if (mp) + return mem_pool__ce_calloc(mp, len); return xcalloc(1, cache_entry_size(len)); } @@ -848,8 +850,10 @@ struct cache_entry *make_cache_entry(struct index_state *istate, return ret; } -struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct object_id *oid, - const char *path, int stage) +struct cache_entry *make_transient_cache_entry(unsigned int mode, + const struct object_id *oid, + const char *path, int stage, + struct mem_pool *mp) { struct cache_entry *ce; int len; @@ -860,7 +864,7 @@ struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct o } len = strlen(path); - ce = make_empty_transient_cache_entry(len); + ce = 
make_empty_transient_cache_entry(len, mp); oidcpy(&ce->oid, oid); memcpy(ce->name, path, len); diff --git a/unpack-trees.c b/unpack-trees.c index e05e6ceff2..dcb40dc8fa 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1031,7 +1031,7 @@ static struct cache_entry *create_ce_entry(const struct traverse_info *info, size_t len = traverse_path_len(info, tree_entry_len(n)); struct cache_entry *ce = is_transient ? - make_empty_transient_cache_entry(len) : + make_empty_transient_cache_entry(len, NULL) : make_empty_cache_entry(istate, len); ce->ce_mode = create_ce_mode(n->mode); From patchwork Tue Sep 22 22:49:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793525 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2E9F4139A for ; Tue, 22 Sep 2020 22:51:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1422F20BED for ; Tue, 22 Sep 2020 22:51:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="YkJZ2gsy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726834AbgIVWvR (ORCPT ); Tue, 22 Sep 2020 18:51:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726617AbgIVWvQ (ORCPT ); Tue, 22 Sep 2020 18:51:16 -0400 Received: from mail-qt1-x841.google.com (mail-qt1-x841.google.com [IPv6:2607:f8b0:4864:20::841]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE286C061755 for ; Tue, 22 Sep 2020 15:51:16 -0700 (PDT) Received: by mail-qt1-x841.google.com with SMTP id 19so17091819qtp.1 for ; Tue, 22 Sep 2020 15:51:16 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GS4h2QNwoW0z+9ey/+u8HNPPBDVmzVwuGUsdvudWn3I=; b=YkJZ2gsyP/8wQdrh8FlOGebprhfJPdBHRSNCQyNBMXu3Ypxx7dSz+MeZ0DttzCvyNc nWQzOhI1uCuM5gIbxtjKlhyZTsL3r1/5NpQgF6VOD4JU9qC9/osCMyL/u7SAOBOHazMv lur6oKmb9vM0nkqLQA/CLviLFxMaD8vqqr98QXP68HzKQovxvmUBPb69b8AyZ6DXZNCy AYzoo3e81RX8ET0NhlbzUc00rYtsEAEHTmbiYd9PvsAZkLWDfSiqx2VEBqABqIjO89Lu zdkjqUt0iKBm4pfZ4SXBp7BA/rnnZIRwT8TBsflHPFZMkfTxKm1PmLGSZq+4An3+w/G9 8Q9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GS4h2QNwoW0z+9ey/+u8HNPPBDVmzVwuGUsdvudWn3I=; b=HCwlmuZnLKIbWMDdBjdsEDQBS4YYoJ0r/9iyuII6BULYcUFO7qrbp1SLQmRliN0yAH +TkUnII4JPh0grclVMt/GbZAuO9ccovvR6By0weWnSYEKLxMKFfXTnxlnZl1/JdoehR4 E5jHAlWUC3prZG74I7ZnCOKyw+3tU4BU70dpJU6EydQO39DtCpFQ+368j/eUJVsN6RCC FlUU8dYpRKJX3HC3M/VBGHa/oHWMdBX443VVPprYYRC1cgtuTsULeS2weLFFCqVHgj1z tU+NNAOXy/kQm30AYX44yLANEe1D6FAnX5UhTswed6KBQr4Ltk4j7mgipPVEyUwJYSij kalQ== X-Gm-Message-State: AOAM531n4vhZuMls8qo9wnlrU2J8pNu5BHSz4P7DnbkklyES/lIDoFfz 5tlnP3PyExrRlMsozkI+mb8jwolVlaPeHA== X-Google-Smtp-Source: ABdhPJwTxD4q+3riGVosu3t6LIX9OFW8pLY3R923Mz8oIbeHDqaSVD7FD3eWXhKAGqkk9hVkYHb1Bg== X-Received: by 2002:ac8:4e19:: with SMTP id c25mr7495541qtw.283.1600815075552; Tue, 22 Sep 2020 15:51:15 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:14 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 14/19] builtin/checkout.c: complete parallel 
checkout support Date: Tue, 22 Sep 2020 19:49:28 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org There is one code path in builtin/checkout.c which still doesn't benefit from parallel checkout because it calls checkout_entry() directly, instead of unpack_trees(). Let's add parallel support for this missing spot as well. Note: the transient cache entries allocated in checkout_merged() are now allocated in a mem_pool which is only discarded after parallel checkout finishes. This is done because the entries need to be valid when run_parallel_checkout() is called. Signed-off-by: Matheus Tavares --- builtin/checkout.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index c0bf5e6711..ddc4079b85 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -27,6 +27,7 @@ #include "wt-status.h" #include "xdiff-interface.h" #include "entry.h" +#include "parallel-checkout.h" static const char * const checkout_usage[] = { N_("git checkout [] "), @@ -230,7 +231,8 @@ static int checkout_stage(int stage, const struct cache_entry *ce, int pos, return error(_("path '%s' does not have their version"), ce->name); } -static int checkout_merged(int pos, const struct checkout *state, int *nr_checkouts) +static int checkout_merged(int pos, const struct checkout *state, + int *nr_checkouts, struct mem_pool *ce_mem_pool) { struct cache_entry *ce = active_cache[pos]; const char *path = ce->name; @@ -291,11 +293,10 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid)) die(_("Unable to add merge result for '%s'"), path); free(result_buf.ptr); - ce = make_transient_cache_entry(mode, &oid, path, 2, NULL); + ce = make_transient_cache_entry(mode, &oid, path, 2, ce_mem_pool); if (!ce) die(_("make_cache_entry 
failed for path '%s'"), path); status = checkout_entry(ce, state, NULL, nr_checkouts); - discard_cache_entry(ce); return status; } @@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts, int nr_checkouts = 0, nr_unmerged = 0; int errs = 0; int pos; + int pc_workers, pc_threshold; + struct mem_pool ce_mem_pool; state.force = 1; state.refresh_cache = 1; state.istate = &the_index; + mem_pool_init(&ce_mem_pool, 0); + get_parallel_checkout_configs(&pc_workers, &pc_threshold); init_checkout_metadata(&state.meta, info->refname, info->commit ? &info->commit->object.oid : &info->oid, NULL); enable_delayed_checkout(&state); + if (pc_workers > 1) + init_parallel_checkout(); for (pos = 0; pos < active_nr; pos++) { struct cache_entry *ce = active_cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -384,10 +391,15 @@ static int checkout_worktree(const struct checkout_opts *opts, &nr_checkouts, opts->overlay_mode); else if (opts->merge) errs |= checkout_merged(pos, &state, - &nr_unmerged); + &nr_unmerged, + &ce_mem_pool); pos = skip_same_name(ce, pos) - 1; } } + if (pc_workers > 1) + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, + NULL, NULL); + mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); remove_marked_cache_entries(&the_index, 1); remove_scheduled_dirs(); errs |= finish_delayed_checkout(&state, &nr_checkouts); From patchwork Tue Sep 22 22:49:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793527 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 42DDF16BC for ; Tue, 22 Sep 2020 22:51:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 244472071A for ; Tue, 22 Sep 2020 22:51:22 +0000 (UTC) Authentication-Results: mail.kernel.org; 
dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="cORnHZfj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726839AbgIVWvV (ORCPT ); Tue, 22 Sep 2020 18:51:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726617AbgIVWvU (ORCPT ); Tue, 22 Sep 2020 18:51:20 -0400 Received: from mail-qt1-x842.google.com (mail-qt1-x842.google.com [IPv6:2607:f8b0:4864:20::842]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1413C061755 for ; Tue, 22 Sep 2020 15:51:20 -0700 (PDT) Received: by mail-qt1-x842.google.com with SMTP id e7so17038007qtj.11 for ; Tue, 22 Sep 2020 15:51:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=lA2zZFhESSm+lGfZzZ5sKrw0x6AF5lYybK5HCYOPWxA=; b=cORnHZfji9pVfS/Zv90NmkFSkC/VismVhfHZBw2o//pd2fus+A651mnMpVbi+0hdOU 2i4ADFcRz/V00/8OS0eXu2T+KB3QUaqW0NgZNgIIzx8iVnZHDGXZpdYx2bHN6WKWvaR4 QRBxfBrvJLfHtjJ0HvDTQGcXH+eQRuwxYk38MxatOY+YF//avOfsOONJx8xRxI+0Maec q/C/yQwqDmjCSQ7JwTtwgnANW3hfVJWeHWEJCfC7vAxfSPeQ7rSD5wGgkoVqVOnP2YOW hHdsFf6SFqCJfR6B6xkj8d7pTn6kKe+HMzTUNKs5cBmviGggXNPB1++AvDOOTnWakUKx Gnkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lA2zZFhESSm+lGfZzZ5sKrw0x6AF5lYybK5HCYOPWxA=; b=X1cyvIi9IjwtPjcqg6itClTeqmolK+sg4yzEqAg3kKI2+U01sipagFPLXc7fDJS5wn 4VbPP6DmqpuF+00poM23MzKoxnLIlFNJzhFVdNj7AcvAT7/7vAglcGpMB1priSZdkR7x HxlCSrtYPIhMmBEx1moGi+CppuOp+r0l05PO0QiLs8domDS2bEFSIXgdb12TcHGPpplk 5pCIShurToseziHsulgvW0eeSAH603npyCHou4Y3viazAHPe7/Bhb2eC8vOgL3e1XyzG H030+Z2hgJno2w9iFbIluOt5SW42/wS8SkvShuEbARlBPW48FBo17vJFq1+i7zFNf9mE 
Cr8w== X-Gm-Message-State: AOAM531o3VfksiizmloAYAbGXB3ie4XLOdifPFUJATFHFwHRScTVCRE1 9P80v+yyuyfddv6m+5yri8rkb0jOIpAC0w== X-Google-Smtp-Source: ABdhPJx1sFJIx1dEjJ2tS3bLFos12F8ftPi90MGLEtTq0J0UKNHifjk+beKtR6jg0T/baKlNOQe0Ag== X-Received: by 2002:ac8:1c16:: with SMTP id a22mr6862811qtk.85.1600815078636; Tue, 22 Sep 2020 15:51:18 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:17 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 15/19] checkout-index: add parallel checkout support Date: Tue, 22 Sep 2020 19:49:29 -0300 Message-Id: <1cf9b807f780467a3837b5e13939ead7c67eaef5.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Signed-off-by: Matheus Tavares --- builtin/checkout-index.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c index 0f1ff73129..33fb933c30 100644 --- a/builtin/checkout-index.c +++ b/builtin/checkout-index.c @@ -12,6 +12,7 @@ #include "cache-tree.h" #include "parse-options.h" #include "entry.h" +#include "parallel-checkout.h" #define CHECKOUT_ALL 4 static int nul_term_line; @@ -160,6 +161,7 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix) int prefix_length; int force = 0, quiet = 0, not_new = 0; int index_opt = 0; + int pc_workers, pc_threshold; struct option builtin_checkout_index_options[] = { OPT_BOOL('a', "all", &all, N_("check out all files in the index")), @@ -214,6 +216,14 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix) hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR); } + if (!to_tempfile) 
+ get_parallel_checkout_configs(&pc_workers, &pc_threshold); + else + pc_workers = 1; + + if (pc_workers > 1) + init_parallel_checkout(); + /* Check out named files first */ for (i = 0; i < argc; i++) { const char *arg = argv[i]; @@ -256,6 +266,12 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix) if (all) checkout_all(prefix, prefix_length); + if (pc_workers > 1) { + /* Errors were already reported */ + run_parallel_checkout(&state, pc_workers, pc_threshold, + NULL, NULL); + } + if (is_lock_file_locked(&lock_file) && write_locked_index(&the_index, &lock_file, COMMIT_LOCK)) die("Unable to write new index file"); From patchwork Tue Sep 22 22:49:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793529 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B88159D for ; Tue, 22 Sep 2020 22:51:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4B13D2076E for ; Tue, 22 Sep 2020 22:51:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="KCr9V6UZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726844AbgIVWvX (ORCPT ); Tue, 22 Sep 2020 18:51:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726617AbgIVWvX (ORCPT ); Tue, 22 Sep 2020 18:51:23 -0400 Received: from mail-qt1-x843.google.com (mail-qt1-x843.google.com [IPv6:2607:f8b0:4864:20::843]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E573DC061755 for ; Tue, 22 Sep 2020 15:51:22 -0700 (PDT) Received: by mail-qt1-x843.google.com with SMTP id 
e7so17038077qtj.11 for ; Tue, 22 Sep 2020 15:51:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mA3fRuXoDIZ2h2bmfvApyLQ8HaIxgJIJJuh/fQoOM5M=; b=KCr9V6UZmIYKONk2RwfEXY3FQxfKD31fUl3evWsPWXNdsJSsfHVuY+TEu5220matQx WeSq5Nm7tnH2ltArei9JK5Qlr32AbDc1oEQuOnXGROk2TE2GeBpIaHJunaaPvbaB72Ro oR/Nmn3yr7dmn9pRAGQT/MsAMur2rV5ZzMcpvGBSldRaWlSLToKxnXgdpHsw7/NHZ8VV aZi8FjeY8l0XVDPi5EhiZAHpFXcFWV3lUyOZ4xZkHtTv9gqenfOhol9LYZo+5EN0EUeu fNrFht4uLqD+EolVWiUb7jYcl3UPEv4duEhoOWWHy7G7IooPixHJBxZO31ykOComcd/L BzQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mA3fRuXoDIZ2h2bmfvApyLQ8HaIxgJIJJuh/fQoOM5M=; b=r7CEVcwGvTfvDscQ7/jnBYUEtKwsF+DWf8gC7UKsG+82NHXOTaG7ZNtJoPzZ7lzqFi PTy9njEacNZCXFv8NOREJNy0gzH9Z33rxl44EkSO+qTPU4lwq6Li1niYUUtdJd5LHYJ8 vZCNghTpdz3E95sMW8ATnPMPEBff4n8eYXsOqWajGCwLvbDRKY8dcUBUR0BaA2woKECe YTxqlQ3UgId5hxC19+SGhLjoN7liBF49z++81pgHRDmV0x9wZVh/uibmzKjlEnkFreKq bcGrPWYHwVxAyMRjoev8dkZmkCxDwezLhA3XFBP8e8xk6usSbBNcZZOoR4T3081RjVh+ HU/Q== X-Gm-Message-State: AOAM533oKlTC7AwdRgSQfD5+yBet6Ff04aTnvGYagsVWGaTpaN7/rTxb Jw8gSzbzASwhyPutG0NgDaoFOUKCb4GUlg== X-Google-Smtp-Source: ABdhPJxHEHnUULwk+QkOsoIGR1EIgTqx4Q4RMHKSgDGvMwzONvkGLwxBiMEpNVhCg2jO/h80P5jXUw== X-Received: by 2002:ac8:435e:: with SMTP id a30mr7102122qtn.201.1600815081649; Tue, 22 Sep 2020 15:51:21 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:20 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com 
Subject: [PATCH v2 16/19] parallel-checkout: add tests for basic operations Date: Tue, 22 Sep 2020 19:49:30 -0300 Message-Id: <64b41d537e68a45f2bb0a0c3078f2cd314b5a57d.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add tests to populate the working tree during clone and checkout using the sequential and parallel modes, to confirm that they produce identical results. Also test basic checkout mechanics, such as checking for symlinks in the leading directories and the adherence to --force. Note: some helper functions are added to a common lib file which is only included by t2080 for now. But it will also be used by another parallel-checkout test in a following patch. Original-patch-by: Jeff Hostetler Signed-off-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- t/lib-parallel-checkout.sh | 39 ++++++ t/t2080-parallel-checkout-basics.sh | 197 ++++++++++++++++++++++++++++ 2 files changed, 236 insertions(+) create mode 100644 t/lib-parallel-checkout.sh create mode 100755 t/t2080-parallel-checkout-basics.sh diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh new file mode 100644 index 0000000000..c95ca27711 --- /dev/null +++ b/t/lib-parallel-checkout.sh @@ -0,0 +1,39 @@ +# Helpers for t208* tests + +# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}` +# and checks that the number of workers spawned is equal to $3. +git_pc() +{ + if test $# -lt 4 + then + BUG "too few arguments to git_pc()" + fi + + workers=$1 threshold=$2 expected_workers=$3 && + shift && shift && shift && + + rm -f trace && + GIT_TRACE2="$(pwd)/trace" git \ + -c checkout.workers=$workers \ + -c checkout.thresholdForParallelism=$threshold \ + -c advice.detachedHead=0 \ + $@ && + + # Check that the expected number of workers has been used.
Note that it + # can be different than the requested number in two cases: when the + # quantity of entries to be checked out is less than the number of + # workers; and when the threshold has not been reached. + # + local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) && + test $workers_in_trace -eq $expected_workers && + rm -f trace +} + +# Verify that both the working tree and the index were created correctly +verify_checkout() +{ + git -C $1 diff-index --quiet HEAD -- && + git -C $1 diff-index --quiet --cached HEAD -- && + git -C $1 status --porcelain >$1.status && + test_must_be_empty $1.status +} diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh new file mode 100755 index 0000000000..c088a06ecc --- /dev/null +++ b/t/t2080-parallel-checkout-basics.sh @@ -0,0 +1,197 @@ +#!/bin/sh + +test_description='parallel-checkout basics + +Ensure that parallel-checkout basically works on clone and checkout, spawning +the required number of workers and correctly populating both the index and +working tree. +' + +TEST_NO_CREATE_REPO=1 +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" + +# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256" +# currently produces a wrong result (See +# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/). +# So we skip the "parallel-checkout during clone" tests when this test flag is +# set to "sha256". Remove this when the bug is fixed. 
+# +if test "$GIT_TEST_DEFAULT_HASH" = "sha256" +then + skip_all="t2080 currently don't work with GIT_TEST_DEFAULT_HASH=sha256" + test_done +fi + +R_BASE=$GIT_BUILD_DIR + +test_expect_success 'sequential clone' ' + git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential && + verify_checkout r_sequential +' + +test_expect_success 'parallel clone' ' + git_pc 2 0 2 clone --quiet -- $R_BASE r_parallel && + verify_checkout r_parallel +' + +test_expect_success 'fallback to sequential clone (threshold)' ' + git -C $R_BASE ls-files >files && + nr_files=$(wc -l a && + mkdir e && + echo e/x >e/x && + ln -s e b && + git add -A && + git commit -m B1 && + + git checkout -b B2 && + echo modified >a && + rm -rf e && + rm b && + mkdir b && + echo b/f >b/f && + ln -s b e && + git init d && + test_commit -C d f && + git submodule add ./d && + git add -A && + git commit -m B2 && + + git checkout --recurse-submodules B1 + ) +' + +test_expect_success SYMLINKS 'sequential checkout' ' + cp -R various various_sequential && + git_pc 1 0 0 -C various_sequential checkout --recurse-submodules B2 && + verify_checkout various_sequential +' + +test_expect_success SYMLINKS 'parallel checkout' ' + cp -R various various_parallel && + git_pc 2 0 2 -C various_parallel checkout --recurse-submodules B2 && + verify_checkout various_parallel +' + +test_expect_success SYMLINKS 'fallback to sequential checkout (threshold)' ' + cp -R various various_sequential_fallback && + git_pc 2 100 0 -C various_sequential_fallback checkout --recurse-submodules B2 && + verify_checkout various_sequential_fallback +' + +test_expect_success SYMLINKS 'compare working trees from checkouts' ' + rm -rf various_sequential/.git && + rm -rf various_parallel/.git && + rm -rf various_sequential_fallback/.git && + diff -qr various_sequential various_parallel && + diff -qr various_sequential various_sequential_fallback +' + +test_cmp_str() +{ + echo "$1" >tmp && + test_cmp tmp "$2" +} + +test_expect_success 'parallel checkout respects 
--[no]-force' ' + git init dirty && + ( + cd dirty && + mkdir D && + test_commit D/F && + test_commit F && + + echo changed >F.t && + rm -rf D && + echo changed >D && + + # We expect 0 workers because there is nothing to be updated + git_pc 2 0 0 checkout HEAD && + test_path_is_file D && + test_cmp_str changed D && + test_cmp_str changed F.t && + + git_pc 2 0 2 checkout --force HEAD && + test_path_is_dir D && + test_cmp_str D/F D/F.t && + test_cmp_str F F.t + ) +' + +test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading dirs' ' + git init symlinks && + ( + cd symlinks && + mkdir D E && + + # Create two entries in D to have enough work for 2 parallel + # workers + test_commit D/A && + test_commit D/B && + test_commit E/C && + rm -rf D && + ln -s E D && + + git_pc 2 0 2 checkout --force HEAD && + ! test -L D && + test_cmp_str D/A D/A.t && + test_cmp_str D/B D/B.t + ) +' + +test_expect_success SYMLINKS,CASE_INSENSITIVE_FS 'symlink colliding with leading dir' ' + git init colliding-symlink && + ( + cd colliding-symlink && + file_hex=$(git hash-object -w --stdin tree && + printf "100644 E/B\0${file_oct}" >>tree && + printf "120000 e\0${sym_oct}" >>tree && + + tree_hex=$(git hash-object -w -t tree --stdin X-Patchwork-Id: 11793531 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 82DEA59D for ; Tue, 22 Sep 2020 22:51:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6425920BED for ; Tue, 22 Sep 2020 22:51:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="RiTup3yW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726847AbgIVWv0 (ORCPT ); Tue, 22 Sep 2020 18:51:26 -0400 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:58472 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726548AbgIVWv0 (ORCPT ); Tue, 22 Sep 2020 18:51:26 -0400 Received: from mail-qt1-x841.google.com (mail-qt1-x841.google.com [IPv6:2607:f8b0:4864:20::841]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DC8AC061755 for ; Tue, 22 Sep 2020 15:51:26 -0700 (PDT) Received: by mail-qt1-x841.google.com with SMTP id n10so17091314qtv.3 for ; Tue, 22 Sep 2020 15:51:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ouj7gAWcoUbIXmBibCkI6L4lTEIRs9mmTaNaQx0X8Nk=; b=RiTup3yWJ4IBimKYGMmLHYGTGSDEieChPvyjoA0ZQo5xDQJc7AZe3R/CLsK475BxXs L2QmsvyCC0n2W2vPMrwfx/KDfLzPS9uIxdXDJS7XGHx6Go08ltu5N3a86CtHjlEDGDoN 5UFw8TZ5DBKyApXBDUDqgJ4mj4ndXfUEReAbLpODsyf+hCxcjYRP/BLcM3JMbb9gAXaC hbijNItOMiiq4TRdPd0DlfQkogZj2dlc7icgZrmf3CSovgrwFSKkLtrFF5qL1nUt7Cpx XXgKwc4UZ5aWDbo5Vv0TKdpiyHf0rVyphpeL2mMYl1Lt+OMyDndNZMNEzWsBxjOor36q PPCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ouj7gAWcoUbIXmBibCkI6L4lTEIRs9mmTaNaQx0X8Nk=; b=GU/yHT32ferrIdIWsXwZCPiuUyLe7zrzFB5GlBaaGYxUbZNJGIfChIsXYqXkX8/AL+ 1VISuqgHntDmcY7Fo5Kiw/5WQ8JUAKxLbtHM1bsDGGNjBzwT1nS9JrAeajl5NQNUOejF E8pDYfohJqlj253iBlw7DVgmtoT4fDARMGHwjNZvGVKNkI3oqwc7s2d1i0qZ1IkGKn0V ZIh1ssPclZMwQth9IUDlcfAuMSTkMs5CxsEq1XWb+DmzgZ1w9XTvMW5xJPIQPzEStfkQ hISCkhnGW9bxiGRM0ylUrVJpAW+8UHuBO1zUvEUwVRlCRc54BZNDwHjxeaaSYgaE+Abh ilLA== X-Gm-Message-State: AOAM530W70JM+PSnkOA6b3nv5VF4d93UPtgHDHUz7TG95QyfH1HG4Tmz gRjJjaRVVGUmjnYwsKwVEAkgkgormZYPGg== X-Google-Smtp-Source: ABdhPJxFSJZUN1tzcUiXF22ATxXaUPhxr+0qrQVzmVdo3w7BXQ8AzSsOjELEjKQdHAzrlSb0zVffHg== X-Received: by 2002:ac8:7650:: with SMTP id i16mr7319303qtr.268.1600815084949; 
Tue, 22 Sep 2020 15:51:24 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:24 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 17/19] parallel-checkout: add tests related to clone collisions Date: Tue, 22 Sep 2020 19:49:31 -0300 Message-Id: <70708d3e31b49f55b1eae6077d5386bb63ce617d.1600814153.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add tests to confirm that path collisions are properly reported during a clone operation using parallel-checkout. Original-patch-by: Jeff Hostetler Signed-off-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- t/t2081-parallel-checkout-collisions.sh | 115 ++++++++++++++++++++++++ 1 file changed, 115 insertions(+) create mode 100755 t/t2081-parallel-checkout-collisions.sh diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh new file mode 100755 index 0000000000..3ce195b892 --- /dev/null +++ b/t/t2081-parallel-checkout-collisions.sh @@ -0,0 +1,115 @@ +#!/bin/sh + +test_description='parallel-checkout collisions' + +. ./test-lib.sh + +# When there are pathname collisions during a clone, Git should report a warning +# listing all of the colliding entries. The sequential code detects a collision +# by calling lstat() before trying to open(O_CREAT) the file. Then, to find the +# colliding pair of an item k, it searches cache_entry[0, k-1]. +# +# This is not sufficient in parallel-checkout mode since colliding files may be +# created in a racy order. The tests in this file make sure the collision +# detection code is extended for parallel-checkout. 
This is done in two parts: +# +# - First, two parallel workers create four colliding files racily. +# - Then this exercise is repeated but forcing the colliding pair to appear in +# the second half of the cache_entry's array. +# +# The second item uses the fact that files with clean/smudge filters are not +# parallel-eligible; and that they are processed sequentially *before* any +# worker is spawned. We set a filter attribute to the last entry in the +# cache_entry[] array, making it non-eligible, so that it is populated first. +# This way, we can test if the collision detection code is correctly looking +# for collision pairs in the second half of the array. + +test_expect_success CASE_INSENSITIVE_FS 'setup' ' + file_hex=$(git hash-object -w --stdin tree && + printf "100644 FILE_x\0${file_oct}" >>tree && + printf "100644 file_X\0${file_oct}" >>tree && + printf "100644 file_x\0${file_oct}" >>tree && + printf "100644 .gitattributes\0${attr_oct}" >>tree && + + tree_hex=$(git hash-object -w -t tree --stdin >filter.log + EOF +' + +clone_and_check_collision() +{ + id=$1 workers=$2 threshold=$3 expected_workers=$4 filter=$5 && + + filter_opts= + if test "$filter" = "use_filter" + then + # We use `core.ignoreCase=0` so that only `file_x` + # matches the pattern in .gitattributes. + # + filter_opts='-c filter.logger.smudge="../logger_script %f" -c core.ignoreCase=0' + fi && + + test_path_is_missing $id.trace && + GIT_TRACE2="$(pwd)/$id.trace" git \ + -c checkout.workers=$workers \ + -c checkout.thresholdForParallelism=$threshold \ + $filter_opts clone --branch=collisions -- . 
r_$id 2>$id.warning && + + # Check that checkout spawned the right number of workers + workers_in_trace=$(grep "child_start\[.\] git checkout--helper" $id.trace | wc -l) && + test $workers_in_trace -eq $expected_workers && + + if test "$filter" = "use_filter" + then + # Make sure only 'file_x' was filtered + test_path_is_file r_$id/filter.log && + echo file_x >expected.filter.log && + test_cmp r_$id/filter.log expected.filter.log + else + test_path_is_missing r_$id/filter.log + fi && + + grep FILE_X $id.warning && + grep FILE_x $id.warning && + grep file_X $id.warning && + grep file_x $id.warning && + test_i18ngrep "the following paths have collided" $id.warning +} + +test_expect_success CASE_INSENSITIVE_FS 'collision detection on parallel clone' ' + clone_and_check_collision parallel 2 0 2 +' + +test_expect_success CASE_INSENSITIVE_FS 'collision detection on fallback to sequential clone' ' + git ls-tree --name-only -r collisions >files && + nr_files=$(wc -l files && + nr_files=$(wc -l X-Patchwork-Id: 11793535 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A53F139A for ; Tue, 22 Sep 2020 22:51:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F1AA9221EF for ; Tue, 22 Sep 2020 22:51:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="xUosciuP" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726850AbgIVWva (ORCPT ); Tue, 22 Sep 2020 18:51:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726548AbgIVWv3 (ORCPT ); Tue, 22 Sep 2020 18:51:29 -0400 Received: from mail-qv1-xf35.google.com (mail-qv1-xf35.google.com
[IPv6:2607:f8b0:4864:20::f35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C13BFC061755 for ; Tue, 22 Sep 2020 15:51:29 -0700 (PDT) Received: by mail-qv1-xf35.google.com with SMTP id cv8so10397092qvb.12 for ; Tue, 22 Sep 2020 15:51:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5lbaPV17AcHOGfhR/neKqPzAHtfoFiFxXZYHRMTvLD8=; b=xUosciuPlujDWEFigqR53E/jwc6Uaqz1J4ALSpi/7dbI2osvR7GVbThtZepElsOLIX /CV1pbnSQiG6THWCdF5IL7Q1zNrXU0SjsJlnHy9qQPRAtEg32ECDdqFepEZ3WdlzMHMU Wbv9x4OFweH4UNkTTEK9hSXGCMtR1nRDjuco9UgO/IKzYlNqp1GoCmzJxkGtQ+LE8q4s eiE4uS4zw5jeFsidCas62vDyZoZZi7d/P9EAVOpaHC65COcmOxEF7UHzlTgRCQErv+Uc JN8V9KaA6l9ZabJreLcKKKh6z4aXyIyqLWUAPnAjHl9VQWez4KtzN8RelLkLXdwp0egq /Qjg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5lbaPV17AcHOGfhR/neKqPzAHtfoFiFxXZYHRMTvLD8=; b=WkCFOiuhDTkjyuvjyTpGTbJN/vQPhWUowrHzCl6gJ+6domp3EgHiK0wJNYBrPgB5Gp iFwBgrROQsoZkaAbHFlGYcWh8FIR51LAjzSx8Ts4e/wBwb9gGV1fMBIK3naFc0XouQi5 kBzvXMQh7TZM25HrlsXGOZR+TcQHn4oYCLwwihsyv8o2lCSTr0l7kQTehoM3YAQmRUR+ qT5E4jvNsKWhGKRK4sij+4FESzYmZ6uN7PhBj67+2rkv8zI2KKdS0BeICOMbH4KzfYlD pIBxZC2PofW8IAniDt6knk431JJjJJPhOTzXW7s4QafF/vtuLWAMpZ1HwoLuaiJVzDLx dPhA== X-Gm-Message-State: AOAM530dhV0eR/pbhgMybTHRcd+IN4t5jvBUGx/fSrimBm+K2hYpgXR1 vCyOSpxXq/2Hxb5/r7QptAhIjMCsC1hEcw== X-Google-Smtp-Source: ABdhPJzTR+g4P4FhkItqHoyFsy6IMbz+ZORkzPXYJj2Uy2f/xDxhIdPoUw/cTjY3kgpA4qFW/oeqhw== X-Received: by 2002:ad4:518c:: with SMTP id b12mr8595588qvp.38.1600815088218; Tue, 22 Sep 2020 15:51:28 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 
22 Sep 2020 15:51:27 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 18/19] parallel-checkout: add tests related to .gitattributes Date: Tue, 22 Sep 2020 19:49:32 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add tests to confirm that `struct conv_attrs` data is correctly passed from the main process to the workers, and that they properly smudge files before writing to the working tree. Also check that non-parallel-eligible entries, such as regular files that require external filters, are correctly smudged and written when parallel-checkout is enabled. Note: to avoid repeating code, some helper functions are extracted from t0028 into a common lib file. Original-patch-by: Jeff Hostetler Signed-off-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- t/lib-encoding.sh | 25 ++++ t/t0028-working-tree-encoding.sh | 25 +--- t/t2082-parallel-checkout-attributes.sh | 174 ++++++++++++++++++++++++ 3 files changed, 200 insertions(+), 24 deletions(-) create mode 100644 t/lib-encoding.sh create mode 100755 t/t2082-parallel-checkout-attributes.sh diff --git a/t/lib-encoding.sh b/t/lib-encoding.sh new file mode 100644 index 0000000000..c52ffbbed5 --- /dev/null +++ b/t/lib-encoding.sh @@ -0,0 +1,25 @@ +# Encoding helpers used by t0028 and t2082 + +test_lazy_prereq NO_UTF16_BOM ' + test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 +' + +test_lazy_prereq NO_UTF32_BOM ' + test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 +' + +write_utf16 () { + if test_have_prereq NO_UTF16_BOM + then + printf '\376\377' + fi && + iconv -f UTF-8 -t UTF-16 +} + +write_utf32 () { + if test_have_prereq NO_UTF32_BOM + then + printf '\0\0\376\377' + fi && + iconv -f UTF-8 -t UTF-32 +} diff --git a/t/t0028-working-tree-encoding.sh
b/t/t0028-working-tree-encoding.sh index bfc4fb9af5..4fffc3a639 100755 --- a/t/t0028-working-tree-encoding.sh +++ b/t/t0028-working-tree-encoding.sh @@ -3,33 +3,10 @@ test_description='working-tree-encoding conversion via gitattributes' . ./test-lib.sh +. "$TEST_DIRECTORY/lib-encoding.sh" GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING -test_lazy_prereq NO_UTF16_BOM ' - test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 -' - -test_lazy_prereq NO_UTF32_BOM ' - test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 -' - -write_utf16 () { - if test_have_prereq NO_UTF16_BOM - then - printf '\376\377' - fi && - iconv -f UTF-8 -t UTF-16 -} - -write_utf32 () { - if test_have_prereq NO_UTF32_BOM - then - printf '\0\0\376\377' - fi && - iconv -f UTF-8 -t UTF-32 -} - test_expect_success 'setup test files' ' git config core.eol lf && diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh new file mode 100755 index 0000000000..6800574588 --- /dev/null +++ b/t/t2082-parallel-checkout-attributes.sh @@ -0,0 +1,174 @@ +#!/bin/sh + +test_description='parallel-checkout: attributes + +Verify that parallel-checkout correctly creates files that require +conversions, as specified in .gitattributes. The main point here is +to check that the conv_attr data is correctly sent to the workers +and that it contains sufficient information to smudge files +properly (without access to the index or attribute stack). +' + +TEST_NO_CREATE_REPO=1 +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" +. 
"$TEST_DIRECTORY/lib-encoding.sh" + +test_expect_success 'parallel-checkout with ident' ' + git init ident && + ( + cd ident && + echo "A ident" >.gitattributes && + echo "\$Id\$" >A && + echo "\$Id\$" >B && + git add -A && + git commit -m id && + + rm A B && + git_pc 2 0 2 reset --hard && + hexsz=$(test_oid hexsz) && + grep -E "\\\$Id: [0-9a-f]{$hexsz} \\\$" A && + grep "\\\$Id\\\$" B + ) +' + +test_expect_success 'parallel-checkout with re-encoding' ' + git init encoding && + ( + cd encoding && + echo text >utf8-text && + cat utf8-text | write_utf16 >utf16-text && + + echo "A working-tree-encoding=UTF-16" >.gitattributes && + cp utf16-text A && + cp utf16-text B && + git add A B .gitattributes && + git commit -m encoding && + + # Check that A (and only A) is stored in UTF-8 + git cat-file -p :A >A.internal && + test_cmp_bin utf8-text A.internal && + git cat-file -p :B >B.internal && + test_cmp_bin utf16-text B.internal && + + # Check that A is re-encoded during checkout + rm A B && + git_pc 2 0 2 checkout A B && + test_cmp_bin utf16-text A + ) +' + +test_expect_success 'parallel-checkout with eol conversions' ' + git init eol && + ( + cd eol && + git config core.autocrlf false && + printf "multi\r\nline\r\ntext" >crlf-text && + printf "multi\nline\ntext" >lf-text && + + echo "A text eol=crlf" >.gitattributes && + echo "B -text" >>.gitattributes && + cp crlf-text A && + cp crlf-text B && + git add A B .gitattributes && + git commit -m eol && + + # Check that A (and only A) is stored with LF format + git cat-file -p :A >A.internal && + test_cmp_bin lf-text A.internal && + git cat-file -p :B >B.internal && + test_cmp_bin crlf-text B.internal && + + # Check that A is converted to CRLF during checkout + rm A B && + git_pc 2 0 2 checkout A B && + test_cmp_bin crlf-text A + ) +' + +test_cmp_str() +{ + echo "$1" >tmp && + test_cmp tmp "$2" +} + +# Entries that require an external filter are not eligible for parallel +# checkout. 
Check that both the parallel-eligible and non-eligible entries are +# properly written in a single checkout process. +# +test_expect_success 'parallel-checkout and external filter' ' + git init filter && + ( + cd filter && + git config filter.x2y.clean "tr x y" && + git config filter.x2y.smudge "tr y x" && + git config filter.x2y.required true && + + echo "A filter=x2y" >.gitattributes && + echo x >A && + echo x >B && + echo x >C && + git add -A && + git commit -m filter && + + # Check that A (and only A) was cleaned + git cat-file -p :A >A.internal && + test_cmp_str y A.internal && + git cat-file -p :B >B.internal && + test_cmp_str x B.internal && + git cat-file -p :C >C.internal && + test_cmp_str x C.internal && + + rm A B C *.internal && + git_pc 2 0 2 checkout A B C && + test_cmp_str x A && + test_cmp_str x B && + test_cmp_str x C + ) +' + +# The delayed queue is independent from the parallel queue, and they should be +# able to work together in the same checkout process. +# +test_expect_success PERL 'parallel-checkout and delayed checkout' ' + write_script rot13-filter.pl "$PERL_PATH" \ + <"$TEST_DIRECTORY"/t0021/rot13-filter.pl && + test_config_global filter.delay.process \ + "\"$(pwd)/rot13-filter.pl\" \"$(pwd)/delayed.log\" clean smudge delay" && + test_config_global filter.delay.required true && + + echo "a b c" >delay-content && + echo "n o p" >delay-rot13-content && + + git init delayed && + ( + cd delayed && + echo "*.a filter=delay" >.gitattributes && + cp ../delay-content test-delay10.a && + cp ../delay-content test-delay11.a && + echo parallel >parallel1.b && + echo parallel >parallel2.b && + git add -A && + git commit -m delayed && + + # Check that the stored data was cleaned + git cat-file -p :test-delay10.a > delay10.internal && + test_cmp delay10.internal ../delay-rot13-content && + git cat-file -p :test-delay11.a > delay11.internal && + test_cmp delay11.internal ../delay-rot13-content && + rm *.internal && + + rm *.a *.b + ) && + + git_pc 2 0 2 
-C delayed checkout -f && + verify_checkout delayed && + + # Check that the *.a files got to the delay queue and were filtered + grep "smudge test-delay10.a .* \[DELAYED\]" delayed.log && + grep "smudge test-delay11.a .* \[DELAYED\]" delayed.log && + test_cmp delayed/test-delay10.a delay-content && + test_cmp delayed/test-delay11.a delay-content +' + +test_done From patchwork Tue Sep 22 22:49:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11793537 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1C6C259D for ; Tue, 22 Sep 2020 22:51:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F34B1221EF for ; Tue, 22 Sep 2020 22:51:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="OkWVhJjR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726853AbgIVWvd (ORCPT ); Tue, 22 Sep 2020 18:51:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58494 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726548AbgIVWvc (ORCPT ); Tue, 22 Sep 2020 18:51:32 -0400 Received: from mail-qv1-xf43.google.com (mail-qv1-xf43.google.com [IPv6:2607:f8b0:4864:20::f43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF1BCC061755 for ; Tue, 22 Sep 2020 15:51:32 -0700 (PDT) Received: by mail-qv1-xf43.google.com with SMTP id cr8so10413226qvb.10 for ; Tue, 22 Sep 2020 15:51:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=cOvQ52Ig5ZRc5vUR0VIpNJbijlT/WkZ8/uPa+mqYwwY=; b=OkWVhJjRaiuCGETZU4+Ng6hfKn4/mtBT/OZNP9N9a6iZQm72U/Nx2oZons1SgEChAX 7jcRexLIbPDXXw+VGBF2vexMIrVT0c+/B9odYYSGiANVfY6k2gSv7hsD1uC11+/l3Ywx mVChbhuhsE7QNEpwxLW7xcwGuojHBsS7wiDn8FMSWWUEU03E04naphDFfSGLNm7hDU92 bsYM8vr4iJ7wqBbSacdVJLZNJs2qi5IYG4hIJCX95D4DQX8Tcgd2YciPuolF4VaQrkHG wcLBu9KasWdMhVtS6Ogx7QzPPjsy16677U96Lwg1opMdoQhKedpsyixSDaP/bBrazWk7 6k7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cOvQ52Ig5ZRc5vUR0VIpNJbijlT/WkZ8/uPa+mqYwwY=; b=SpQSVdox72G5GNYw1ZZPg4naZI1cnX81z4Cm/NNw8NHRZlqiZLBGT1mpE79ZgfPlP5 UrZ5X+18MeMboaH3fc7ny8M1r/RIk0krmRR1cQTPYPMx661YVtSNaqvBGSoXLaQFMaj1 NXGTPmGqaBk0INnw47fED/Gy7J3rToo9PMoFRqHxDdvvHoXBiXA5h+sHh5HQX6mzTSAi NaqMQF7+Ogm+v+6APUzGQsA4NMi0TiqVTROe/Xsl1DZRIf4giKeuCYqgRVGBBV+XGLlm ps8y0ywnYvQYY0VWhcCercpw0SqFdt80PfQQkIxFmPGqEG6oVytqrHkNSPTdi9EWQfY6 +gfA== X-Gm-Message-State: AOAM5338FRkEWAm4TKKeYhrErIWScDNkPDc3Qyy5jH+skbwDdyKYKeL4 +tjs5oUrt6mHDmaomCoYiZC6X6VdOhU+8w== X-Google-Smtp-Source: ABdhPJxr29NkgJHdk9NRpzYB4z2QzxU5fxU/HuL8oHiKIsFC+8vORcyXuKmb1U6fVeWx5YEZMQBXTg== X-Received: by 2002:a05:6214:1181:: with SMTP id t1mr8631711qvv.11.1600815091558; Tue, 22 Sep 2020 15:51:31 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.96.45]) by smtp.gmail.com with ESMTPSA id p187sm12342359qkd.129.2020.09.22.15.51.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 22 Sep 2020 15:51:30 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com Subject: [PATCH v2 19/19] ci: run test round with parallel-checkout enabled Date: Tue, 22 Sep 2020 19:49:33 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We 
already have tests for the basic parallel-checkout operations. But this code can also run in other commands, such as git-read-tree and git-sparse-checkout, which are currently not tested with multiple workers. To promote wider test coverage without duplicating tests: 1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally force parallel-checkout execution during the whole test suite. 2. Include this variable in the second test round of the linux-gcc job of our CI scripts. This round runs `make test` again with some optional GIT_TEST_* variables enabled, so there is no additional overhead in exercising the parallel-checkout code here. Note: the specific parallel-checkout tests t208* cannot be used in combination with GIT_TEST_CHECKOUT_WORKERS as they need to set and check the number of workers by themselves. So skip those tests when this flag is set. Signed-off-by: Matheus Tavares --- ci/run-build-and-tests.sh | 1 + parallel-checkout.c | 14 ++++++++++++++ t/README | 4 ++++ t/lib-parallel-checkout.sh | 6 ++++++ t/t2081-parallel-checkout-collisions.sh | 1 + 5 files changed, 26 insertions(+) diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 6c27b886b8..aa32ddc361 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -22,6 +22,7 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 + export GIT_TEST_CHECKOUT_WORKERS=2 make test ;; linux-clang) diff --git a/parallel-checkout.c b/parallel-checkout.c index 5156b14c53..94b44d2a48 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -32,6 +32,20 @@ enum pc_status parallel_checkout_status(void) void get_parallel_checkout_configs(int *num_workers, int *threshold) { + char *env_workers = getenv("GIT_TEST_CHECKOUT_WORKERS"); + + if (env_workers && *env_workers) { + if (strtol_i(env_workers, 10, num_workers)) { + die("invalid value for GIT_TEST_CHECKOUT_WORKERS: '%s'", + env_workers); 
+ } + if (*num_workers < 1) + *num_workers = online_cpus(); + + *threshold = 0; + return; + } + if (git_config_get_int("checkout.workers", num_workers)) *num_workers = 1; else if (*num_workers < 1) diff --git a/t/README b/t/README index 2adaf7c2d2..cd1b15c55a 100644 --- a/t/README +++ b/t/README @@ -425,6 +425,10 @@ GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to use in the test scripts. Recognized values for <hash-algo> are "sha1" and "sha256". +GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting +to <n> and 'checkout.thresholdForParallelism' to 0, forcing the +execution of the parallel-checkout code. + Naming Tests ------------ diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh index c95ca27711..80bb0a0900 100644 --- a/t/lib-parallel-checkout.sh +++ b/t/lib-parallel-checkout.sh @@ -1,5 +1,11 @@ # Helpers for t208* tests +if ! test -z "$GIT_TEST_CHECKOUT_WORKERS" +then + skip_all="skipping test, GIT_TEST_CHECKOUT_WORKERS is set" + test_done +fi + # Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}` # and checks that the number of workers spawned is equal to $3. git_pc() diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh index 3ce195b892..5dbff54bfb 100755 --- a/t/t2081-parallel-checkout-collisions.sh +++ b/t/t2081-parallel-checkout-collisions.sh @@ -3,6 +3,7 @@ test_description='parallel-checkout collisions' . ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" # When there are pathname collisions during a clone, Git should report a warning # listing all of the colliding entries. The sequential code detects a collision