From patchwork Wed Nov 14 00:25:50 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681727
From: Elijah Newren
Subject: [PATCH v2 01/11] git-fast-import.txt: fix documentation for --quiet option
Date: Tue, 13 Nov 2018 16:25:50 -0800
Message-ID: <20181114002600.29233-2-newren@gmail.com>
X-Mailer: git-send-email 2.19.1.1063.g2b8e4a4f82.dirty
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>
X-Mailing-List: git@vger.kernel.org

Signed-off-by: Elijah Newren
---
 Documentation/git-fast-import.txt | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index e81117d27f..7ab97745a6 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -40,9 +40,10 @@ OPTIONS
 	not contain the old commit).
 
 --quiet::
-	Disable all non-fatal output, making fast-import silent when it
-	is successful. This option disables the output shown by
-	--stats.
+	Disable the output shown by --stats, making fast-import usually
+	be silent when it is successful. However, if the import stream
+	has directives intended to show user output (e.g. `progress`
+	directives), the corresponding messages will still be shown.
 
 --stats::
 	Display some basic statistics about the objects fast-import has

From patchwork Wed Nov 14 00:25:51 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681745
From: Elijah Newren
Subject: [PATCH v2 02/11] git-fast-export.txt: clarify misleading documentation about rev-list args
Date: Tue, 13 Nov 2018 16:25:51 -0800
Message-ID: <20181114002600.29233-3-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

Signed-off-by: Elijah Newren
---
 Documentation/git-fast-export.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
index ce954be532..fda55b3284 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -119,7 +119,8 @@ marks the same across runs.
 	'git rev-list', that specifies the specific objects and references
 	to export. For example, `master~10..master` causes the
 	current master reference to be exported along with all objects
-	added since its 10th ancestor commit.
+	added since its 10th ancestor commit and all files common to
+	master{tilde}9 and master{tilde}10.
 
 EXAMPLES
 --------

From patchwork Wed Nov 14 00:25:52 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681743
From: Elijah Newren
Subject: [PATCH v2 03/11] fast-export: use value from correct enum
Date: Tue, 13 Nov 2018 16:25:52 -0800
Message-ID: <20181114002600.29233-4-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

ABORT and ERROR happen to have the same value, but come from different
enums. Use the one from the correct enum, and while at it, rename the
values to avoid such problems.

Signed-off-by: Elijah Newren
---
 builtin/fast-export.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 456797c12a..af724e9937 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -31,8 +31,8 @@ static const char *fast_export_usage[] = {
 };
 
 static int progress;
-static enum { ABORT, VERBATIM, WARN, WARN_STRIP, STRIP } signed_tag_mode = ABORT;
-static enum { ERROR, DROP, REWRITE } tag_of_filtered_mode = ERROR;
+static enum { SIGNED_TAG_ABORT, VERBATIM, WARN, WARN_STRIP, STRIP } signed_tag_mode = SIGNED_TAG_ABORT;
+static enum { TAG_FILTERING_ABORT, DROP, REWRITE } tag_of_filtered_mode = TAG_FILTERING_ABORT;
 static int fake_missing_tagger;
 static int use_done_feature;
 static int no_data;
@@ -46,7 +46,7 @@ static int parse_opt_signed_tag_mode(const struct option *opt,
 				     const char *arg, int unset)
 {
 	if (unset || !strcmp(arg, "abort"))
-		signed_tag_mode = ABORT;
+		signed_tag_mode = SIGNED_TAG_ABORT;
 	else if (!strcmp(arg, "verbatim") || !strcmp(arg, "ignore"))
 		signed_tag_mode = VERBATIM;
 	else if (!strcmp(arg, "warn"))
@@ -64,7 +64,7 @@ static int parse_opt_tag_of_filtered_mode(const struct option *opt,
 					  const char *arg, int unset)
 {
 	if (unset || !strcmp(arg, "abort"))
-		tag_of_filtered_mode = ERROR;
+		tag_of_filtered_mode = TAG_FILTERING_ABORT;
 	else if (!strcmp(arg, "drop"))
 		tag_of_filtered_mode = DROP;
 	else if (!strcmp(arg, "rewrite"))
@@ -727,7 +727,7 @@ static void handle_tag(const char *name, struct tag *tag)
 					       "\n-----BEGIN PGP SIGNATURE-----\n");
 		if (signature)
 			switch(signed_tag_mode) {
-			case ABORT:
+			case SIGNED_TAG_ABORT:
 				die("encountered signed tag %s; use "
 				    "--signed-tags= to handle it",
 				    oid_to_hex(&tag->object.oid));
@@ -752,7 +752,7 @@ static void handle_tag(const char *name, struct tag *tag)
 	tagged_mark = get_object_mark(tagged);
 	if (!tagged_mark) {
 		switch(tag_of_filtered_mode) {
-		case ABORT:
+		case TAG_FILTERING_ABORT:
 			die("tag %s tags unexported object; use "
 			    "--tag-of-filtered-object= to handle it",
 			    oid_to_hex(&tag->object.oid));

From patchwork Wed Nov 14 00:25:53 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681735
From: Elijah Newren
Subject: [PATCH v2 04/11] fast-export: avoid dying when filtering by paths and old tags exist
Date: Tue, 13 Nov 2018 16:25:53 -0800
Message-ID: <20181114002600.29233-5-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

If --tag-of-filtered-object=rewrite is specified along with a set of
paths to limit what is exported, then any tags pointing to old commits
that do not contain any of those specified paths cause problems. Since
the old tagged commit is not exported, fast-export attempts to rewrite
such tags to an ancestor commit which was exported. If no such commit
exists, then fast-export currently die()s. Five years after the tag
rewriting logic was added to fast-export (see commit 2d8ad4691921,
"fast-export: Add a --tag-of-filtered-object option for newly dangling
tags", 2009-06-25), fast-import gained the ability to delete refs (see
commit 4ee1b225b99f, "fast-import: add support to delete refs",
2014-04-20), so now we do have a valid option to rewrite the tag to.
Delete these tags instead of dying.

Signed-off-by: Elijah Newren
---
 builtin/fast-export.c  |  9 ++++++---
 t/t9350-fast-export.sh | 16 ++++++++++++++++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index af724e9937..b984a44224 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -774,9 +774,12 @@ static void handle_tag(const char *name, struct tag *tag)
 					break;
 				if (!(p->object.flags & TREESAME))
 					break;
-				if (!p->parents)
-					die("can't find replacement commit for tag %s",
-					     oid_to_hex(&tag->object.oid));
+				if (!p->parents) {
+					printf("reset %s\nfrom %s\n\n",
+					       name, sha1_to_hex(null_sha1));
+					free(buf);
+					return;
+				}
 				p = p->parents->item;
 			}
 			tagged_mark = get_object_mark(&p->object);
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 6a392e87bc..3400ebeb51 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -325,6 +325,22 @@ test_expect_success 'rewriting tag of filtered out object' '
 	)
 '
 
+test_expect_success 'rewrite tag predating pathspecs to nothing' '
+	test_create_repo rewrite_tag_predating_pathspecs &&
+	(
+		cd rewrite_tag_predating_pathspecs &&
+
+		test_commit initial &&
+
+		git tag -a -m "Some old tag" v0.0.0.0.0.0.1 &&
+
+		test_commit bar &&
+
+		git fast-export --tag-of-filtered-object=rewrite --all -- bar.t >output &&
+		grep from.$ZERO_OID output
+	)
+'
+
 cat > limit-by-paths/expected << EOF
 blob
 mark :1

From patchwork Wed Nov 14 00:25:54 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681731
From: Elijah Newren
Subject: [PATCH v2 05/11] fast-export: move commit rewriting logic into a function for reuse
Date: Tue, 13 Nov 2018 16:25:54 -0800
Message-ID: <20181114002600.29233-6-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

Logic to replace a filtered commit with an unfiltered ancestor is useful
elsewhere; put it into a function we can call.

Signed-off-by: Elijah Newren
---
 builtin/fast-export.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index b984a44224..7888fc98b5 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -187,6 +187,22 @@ static int get_object_mark(struct object *object)
 	return ptr_to_mark(decoration);
 }
 
+static struct commit *rewrite_commit(struct commit *p)
+{
+	for (;;) {
+		if (p->parents && p->parents->next)
+			break;
+		if (p->object.flags & UNINTERESTING)
+			break;
+		if (!(p->object.flags & TREESAME))
+			break;
+		if (!p->parents)
+			return NULL;
+		p = p->parents->item;
+	}
+	return p;
+}
+
 static void show_progress(void)
 {
 	static int counter = 0;
@@ -766,21 +782,12 @@ static void handle_tag(const char *name, struct tag *tag)
 				    oid_to_hex(&tag->object.oid),
 				    type_name(tagged->type));
 			}
-			p = (struct commit *)tagged;
-			for (;;) {
-				if (p->parents && p->parents->next)
-					break;
-				if (p->object.flags & UNINTERESTING)
-					break;
-				if (!(p->object.flags & TREESAME))
-					break;
-				if (!p->parents) {
-					printf("reset %s\nfrom %s\n\n",
-					       name, sha1_to_hex(null_sha1));
-					free(buf);
-					return;
-				}
-				p = p->parents->item;
+			p = rewrite_commit((struct commit *)tagged);
+			if (!p) {
+				printf("reset %s\nfrom %s\n\n",
+				       name, sha1_to_hex(null_sha1));
+				free(buf);
+				return;
 			}
 			tagged_mark = get_object_mark(&p->object);
 		}

From patchwork Wed Nov 14 00:25:55 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681737
From: Elijah Newren
Subject: [PATCH v2 06/11] fast-export: when using paths, avoid corrupt stream with non-existent mark
Date: Tue, 13 Nov 2018 16:25:55 -0800
Message-ID: <20181114002600.29233-7-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

If file paths are specified to fast-export and multiple refs point to a
commit that does not touch any of the relevant file paths, then
fast-export can hit problems. fast-export has a list of additional refs
that it needs to explicitly set after exporting all blobs and commits,
and when it tries to get_object_mark() on the relevant commit, it can
get a mark of 0, i.e. "not found", because the commit in question did
not touch the relevant paths and thus was not exported. Trying to
import a stream with a mark corresponding to an unexported object will
cause fast-import to crash.

Avoid this problem by taking the commit the ref points to and finding an
ancestor of it that was exported, and make the ref point to that commit
instead.

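For illustration only (this is not part of the patch): the ancestor
search described above can be sketched as standalone C. Everything
prefixed with sk_ or SK_ below is a hypothetical, simplified stand-in
for git's internal commit structures and flags; only the control flow
is meant to mirror rewrite_commit() from patch 05/11.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for git's UNINTERESTING and TREESAME flags. */
#define SK_UNINTERESTING (1u << 0)
#define SK_TREESAME      (1u << 1)

/* Hypothetical, heavily simplified commit: at most two parents. */
struct sk_commit {
	unsigned flags;
	struct sk_commit *parent;        /* first parent, NULL for a root */
	struct sk_commit *second_parent; /* non-NULL only for merges */
	int mark;                        /* 0 means "never exported" */
};

/*
 * Walk first-parent ancestry until reaching a commit the exporter
 * would have emitted: a merge, a boundary commit, or a commit that
 * changed the selected paths. Returns NULL if history runs out first,
 * in which case the caller has nothing to point the ref at.
 */
static struct sk_commit *sk_find_exported_ancestor(struct sk_commit *c)
{
	assert(c != NULL);
	for (;;) {
		if (c->parent && c->second_parent)
			break;
		if (c->flags & SK_UNINTERESTING)
			break;
		if (!(c->flags & SK_TREESAME))
			break;
		if (!c->parent)
			return NULL;
		c = c->parent;
	}
	return c;
}
```

In this sketch, a NULL result corresponds to the case where the patch
emits a reset to the all-zero object ID (deleting the ref) rather than
a reset to a mark that was never written.
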
Signed-off-by: Elijah Newren --- builtin/fast-export.c | 13 ++++++++++++- t/t9350-fast-export.sh | 20 ++++++++++++++++++++ 2 files changed, 32 insertions(+), 1 deletion(-) diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 7888fc98b5..2eafe351ea 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -900,7 +900,18 @@ static void handle_tags_and_duplicates(void) if (anonymize) name = anonymize_refname(name); /* create refs pointing to already seen commits */ - commit = (struct commit *)object; + commit = rewrite_commit((struct commit *)object); + if (!commit) { + /* + * Neither this object nor any of its + * ancestors touch any relevant paths, so + * it has been filtered to nothing. Delete + * it. + */ + printf("reset %s\nfrom %s\n\n", + name, sha1_to_hex(null_sha1)); + continue; + } printf("reset %s\nfrom :%d\n\n", name, get_object_mark(&commit->object)); show_progress(); diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 3400ebeb51..299120ba70 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -382,6 +382,26 @@ test_expect_success 'path limiting with import-marks does not lose unmodified fi grep file0 actual ' +test_expect_success 'avoid corrupt stream with non-existent mark' ' + test_create_repo avoid_non_existent_mark && + ( + cd avoid_non_existent_mark && + + test_commit important-path && + + test_commit ignored && + + git branch A && + git branch B && + + echo foo >>important-path.t && + git add important-path.t && + test_commit more changes && + + git fast-export --all -- important-path.t | git fast-import --force + ) +' + test_expect_success 'full-tree re-shows unmodified files' ' git checkout -f simple && git fast-export --full-tree simple >actual && From patchwork Wed Nov 14 00:25:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 10681723 Return-Path: Received: from mail.wl.linuxfoundation.org 
(pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C936E13B5 for ; Wed, 14 Nov 2018 00:26:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B7B2D2B662 for ; Wed, 14 Nov 2018 00:26:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AC40B2B695; Wed, 14 Nov 2018 00:26:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_ADSP_CUSTOM_MED, FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7436A2B662 for ; Wed, 14 Nov 2018 00:26:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727074AbeKNK1A (ORCPT ); Wed, 14 Nov 2018 05:27:00 -0500 Received: from mx0a-00153501.pphosted.com ([67.231.148.48]:38422 "EHLO mx0a-00153501.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726823AbeKNK1A (ORCPT ); Wed, 14 Nov 2018 05:27:00 -0500 Received: from pps.filterd (m0131697.ppops.net [127.0.0.1]) by mx0a-00153501.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id wAE0Ib3o024224; Tue, 13 Nov 2018 16:26:05 -0800 Received: from mail.palantir.com ([198.97.14.70]) by mx0a-00153501.pphosted.com with ESMTP id 2nr7by051s-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=OK); Tue, 13 Nov 2018 16:26:05 -0800 Received: from dc-prod-exch-01.YOJOE.local (10.193.18.14) by dc-prod-exch-01.YOJOE.local (10.193.18.14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1531.3; Tue, 13 Nov 2018 19:26:02 -0500 Received: from smtp-transport.yojoe.local (10.129.56.124) by dc-prod-exch-01.YOJOE.local (10.193.18.14) with 
From: Elijah Newren
Subject: [PATCH v2 07/11] fast-export: ensure we export requested refs
Date: Tue, 13 Nov 2018 16:25:56 -0800
Message-ID: <20181114002600.29233-8-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

If file paths are specified to fast-export and a ref points to a commit
that does not touch any of the relevant paths, then that ref would
sometimes fail to be exported. (This depends on whether any ancestors of
the commit which do touch the relevant paths would be exported with that
same ref name or a different ref name.)

To avoid this problem, put *all* specified refs into extra_refs to start,
and then as we export each commit, remove the refname used in the
'commit $REFNAME' directive from extra_refs. Then, in
handle_tags_and_duplicates() we know which refs actually do need a manual
reset directive in order to be included.
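As an illustration only (not part of this patch), the bookkeeping
described above amounts to a set difference: the requested refs minus the
refs that were named in a 'commit' directive. A minimal shell sketch,
assuming a hypothetical helper named refs_needing_reset and assuming that
no data payload in the stream contains a line beginning with "commit ":

```shell
# Hypothetical helper, not part of git.
# $1: file listing the requested refs, one per line
# $2: file holding a fast-export stream
# Prints the requested refs that no "commit <ref>" directive mentions,
# i.e. the refs that would still need a manual "reset" at the end.
refs_needing_reset () {
	committed=$(mktemp) || return 1
	# Collect every refname used in a "commit <ref>" directive.
	sed -n 's/^commit //p' "$2" | sort -u >"$committed"
	# comm -23 keeps the lines found only in the first (requested) list.
	sort -u "$1" | comm -23 - "$committed"
	rm -f "$committed"
}
```

The C code reaches the same result incrementally, by removing entries
from extra_refs while commits are written, rather than by post-processing
the stream as this sketch does.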
This means that we do need some special handling for excluded refs; e.g. if someone runs git fast-export ^master master then they've asked for master to be exported, but they have also asked for the commit which master points to and all of its history to be excluded. That logically means ref deletion. Previously, such refs were just silently omitted from being exported despite having been explicitly requested for export. Signed-off-by: Elijah Newren --- builtin/fast-export.c | 54 ++++++++++++++++++++++++++++++++---------- t/t9350-fast-export.sh | 16 ++++++++++--- 2 files changed, 55 insertions(+), 15 deletions(-) diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 2eafe351ea..2fef00436b 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -38,6 +38,7 @@ static int use_done_feature; static int no_data; static int full_tree; static struct string_list extra_refs = STRING_LIST_INIT_NODUP; +static struct string_list tag_refs = STRING_LIST_INIT_NODUP; static struct refspec refspecs = REFSPEC_INIT_FETCH; static int anonymize; static struct revision_sources revision_sources; @@ -611,6 +612,13 @@ static void handle_commit(struct commit *commit, struct rev_info *rev, export_blob(&diff_queued_diff.queue[i]->two->oid); refname = *revision_sources_at(&revision_sources, commit); + /* + * FIXME: string_list_remove() below for each ref is overall + * O(N^2). Compared to a history walk and diffing trees, this is + * just lost in the noise in practice. However, theoretically a + * repo may have enough refs for this to become slow. 
+ */ + string_list_remove(&extra_refs, refname, 0); if (anonymize) { refname = anonymize_refname(refname); anonymize_ident_line(&committer, &committer_end); @@ -814,7 +822,7 @@ static struct commit *get_commit(struct rev_cmdline_entry *e, char *full_name) /* handle nested tags */ while (tag && tag->object.type == OBJ_TAG) { parse_object(the_repository, &tag->object.oid); - string_list_append(&extra_refs, full_name)->util = tag; + string_list_append(&tag_refs, full_name)->util = tag; tag = (struct tag *)tag->tagged; } if (!tag) @@ -873,25 +881,30 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info) } /* - * This ref will not be updated through a commit, lets make - * sure it gets properly updated eventually. + * Make sure this ref gets properly updated eventually, whether + * through a commit or manually at the end. */ - if (*revision_sources_at(&revision_sources, commit) || - commit->object.flags & SHOWN) + if (e->item->type != OBJ_TAG) string_list_append(&extra_refs, full_name)->util = commit; + if (!*revision_sources_at(&revision_sources, commit)) *revision_sources_at(&revision_sources, commit) = full_name; } + + string_list_sort(&extra_refs); + string_list_remove_duplicates(&extra_refs, 0); } -static void handle_tags_and_duplicates(void) +static void handle_tags_and_duplicates(struct string_list *extras) { struct commit *commit; int i; - for (i = extra_refs.nr - 1; i >= 0; i--) { - const char *name = extra_refs.items[i].string; - struct object *object = extra_refs.items[i].util; + for (i = extras->nr - 1; i >= 0; i--) { + const char *name = extras->items[i].string; + struct object *object = extras->items[i].util; + int mark; + switch (object->type) { case OBJ_TAG: handle_tag(name, (struct tag *)object); @@ -912,8 +925,24 @@ static void handle_tags_and_duplicates(void) name, sha1_to_hex(null_sha1)); continue; } - printf("reset %s\nfrom :%d\n\n", name, - get_object_mark(&commit->object)); + + mark = get_object_mark(&commit->object); + if (!mark) { 
+ /* + * Getting here means we have a commit which + * was excluded by a negative refspec (e.g. + * fast-export ^master master). If the user + * wants the branch exported but every commit + * in its history to be deleted, that sounds + * like a ref deletion to me. + */ + printf("reset %s\nfrom %s\n\n", + name, sha1_to_hex(null_sha1)); + continue; + } + + printf("reset %s\nfrom :%d\n\n", name, mark + ); show_progress(); break; } @@ -1101,7 +1130,8 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix) } } - handle_tags_and_duplicates(); + handle_tags_and_duplicates(&extra_refs); + handle_tags_and_duplicates(&tag_refs); handle_deletes(); if (export_filename && lastimportid != last_idnum) diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 299120ba70..50c2fceef4 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -544,10 +544,20 @@ test_expect_success 'use refspec' ' test_cmp expected actual ' -test_expect_success 'delete refspec' ' +test_expect_success 'delete ref because entire history excluded' ' git branch to-delete && - git fast-export --refspec :refs/heads/to-delete to-delete ^to-delete > actual && - cat > expected <<-EOF && + git fast-export to-delete ^to-delete >actual && + cat >expected <<-EOF && + reset refs/heads/to-delete + from 0000000000000000000000000000000000000000 + + EOF + test_cmp expected actual +' + +test_expect_success 'delete refspec' ' + git fast-export --refspec :refs/heads/to-delete >actual && + cat >expected <<-EOF && reset refs/heads/to-delete from 0000000000000000000000000000000000000000 From patchwork Wed Nov 14 00:25:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 10681741 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 273EE139B for ; Wed, 14 Nov 2018 00:26:30 
From: Elijah Newren
Subject: [PATCH v2 08/11] fast-export: add --reference-excluded-parents option
Date: Tue, 13 Nov 2018 16:25:57 -0800
Message-ID: <20181114002600.29233-9-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

git filter-branch has a nifty feature allowing you to rewrite, e.g. just
the last 8 commits of a linear history

    git filter-branch $OPTIONS HEAD~8..HEAD

If you try the same with git fast-export, you instead get a history of
only 8 commits, with HEAD~7 being rewritten into a root commit. There
are two alternatives:

  1) Don't use the negative revision specification, and when you're
     filtering the output to make modifications to the last 8 commits,
     just be careful to not modify any earlier commits somehow.

  2) First run 'git fast-export --export-marks=somefile HEAD~8', then
     run 'git fast-export --import-marks=somefile HEAD~8..HEAD'.
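For illustration, the second alternative could be wrapped roughly as
follows. This is only a sketch: the function name export_last_commits and
its two parameters are hypothetical, and only the two fast-export
invocations come from the description above.

```shell
# Hypothetical wrapper, not part of git.
# $1: number of trailing commits to export
# $2: path of a scratch marks file
export_last_commits () {
	# Pass 1: walk the older history only to record its marks.
	git fast-export --export-marks="$2" "HEAD~$1" >/dev/null &&
	# Pass 2: export the recent commits; parents found in the marks
	# file are referenced as "from :<mark>" instead of being dropped.
	git fast-export --import-marks="$2" "HEAD~$1..HEAD"
}
```

Something like `export_last_commits 8 /tmp/marks` would then emit the
last 8 commits with the oldest one still attached to its parent.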
Both are more error prone than I'd like (the first for obvious reasons; with the second option I have sometimes accidentally included too many revisions in the first command and then found that the corresponding extra revisions were not exported by the second command and thus were not modified as I expected). Also, both are poor from a performance perspective. Add a new --reference-excluded-parents option which will cause fast-export to refer to commits outside the specified rev-list-args range by their sha1sum. Such a stream will only be useful in a repository which already contains the necessary commits (much like the restriction imposed when using --no-data). Note from Peff: I think we might be able to do a little more optimization here. If we're exporting HEAD^..HEAD and there's an object in HEAD^ which is unchanged in HEAD, I think we'd still print it (because it would not be marked SHOWN), but we could omit it (by walking the tree of the boundary commits and marking them shown). I don't think it's a blocker for what you're doing here, but just a possible future optimization. Signed-off-by: Elijah Newren --- Documentation/git-fast-export.txt | 17 +++++++++++-- builtin/fast-export.c | 42 +++++++++++++++++++++++-------- t/t9350-fast-export.sh | 11 ++++++++ 3 files changed, 58 insertions(+), 12 deletions(-) diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt index fda55b3284..f65026662a 100644 --- a/Documentation/git-fast-export.txt +++ b/Documentation/git-fast-export.txt @@ -110,6 +110,18 @@ marks the same across runs. the shape of the history and stored tree. See the section on `ANONYMIZING` below. +--reference-excluded-parents:: + By default, running a command such as `git fast-export + master~5..master` will not include the commit master{tilde}5 + and will make master{tilde}4 no longer have master{tilde}5 as + a parent (though both the old master{tilde}4 and new + master{tilde}4 will have all the same files). 
Use + --reference-excluded-parents to instead have the stream + refer to commits in the excluded range of history by their + sha1sum. Note that the resulting stream can only be used by a + repository which already contains the necessary parent + commits. + --refspec:: Apply the specified refspec to each ref exported. Multiple of them can be specified. @@ -119,8 +131,9 @@ marks the same across runs. 'git rev-list', that specifies the specific objects and references to export. For example, `master~10..master` causes the current master reference to be exported along with all objects - added since its 10th ancestor commit and all files common to - master{tilde}9 and master{tilde}10. + added since its 10th ancestor commit and (unless the + --reference-excluded-parents option is specified) all files + common to master{tilde}9 and master{tilde}10. EXAMPLES -------- diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 2fef00436b..3cc98c31ad 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -37,6 +37,7 @@ static int fake_missing_tagger; static int use_done_feature; static int no_data; static int full_tree; +static int reference_excluded_commits; static struct string_list extra_refs = STRING_LIST_INIT_NODUP; static struct string_list tag_refs = STRING_LIST_INIT_NODUP; static struct refspec refspecs = REFSPEC_INIT_FETCH; @@ -596,7 +597,8 @@ static void handle_commit(struct commit *commit, struct rev_info *rev, message += 2; if (commit->parents && - get_object_mark(&commit->parents->item->object) != 0 && + (get_object_mark(&commit->parents->item->object) != 0 || + reference_excluded_commits) && !full_tree) { parse_commit_or_die(commit->parents->item); diff_tree_oid(get_commit_tree_oid(commit->parents->item), @@ -644,13 +646,21 @@ static void handle_commit(struct commit *commit, struct rev_info *rev, unuse_commit_buffer(commit, commit_buffer); for (i = 0, p = commit->parents; p; p = p->next) { - int mark = get_object_mark(&p->item->object); - if
(!mark) + struct object *obj = &p->item->object; + int mark = get_object_mark(obj); + + if (!mark && !reference_excluded_commits) continue; if (i == 0) - printf("from :%d\n", mark); + printf("from "); + else + printf("merge "); + if (mark) + printf(":%d\n", mark); else - printf("merge :%d\n", mark); + printf("%s\n", sha1_to_hex(anonymize ? + anonymize_sha1(&obj->oid) : + obj->oid.hash)); i++; } @@ -931,13 +941,22 @@ static void handle_tags_and_duplicates(struct string_list *extras) /* * Getting here means we have a commit which * was excluded by a negative refspec (e.g. - * fast-export ^master master). If the user + * fast-export ^master master). If we are + * referencing excluded commits, set the ref + * to the exact commit. Otherwise, the user * wants the branch exported but every commit - * in its history to be deleted, that sounds - * like a ref deletion to me. + * in its history to be deleted, which basically + * just means deletion of the ref. */ - printf("reset %s\nfrom %s\n\n", - name, sha1_to_hex(null_sha1)); + if (!reference_excluded_commits) { + /* delete the ref */ + printf("reset %s\nfrom %s\n\n", + name, sha1_to_hex(null_sha1)); + continue; + } + /* set ref to commit using oid, not mark */ + printf("reset %s\nfrom %s\n\n", name, + sha1_to_hex(commit->object.oid.hash)); continue; } @@ -1074,6 +1093,9 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix) OPT_STRING_LIST(0, "refspec", &refspecs_list, N_("refspec"), N_("Apply refspec to exported refs")), OPT_BOOL(0, "anonymize", &anonymize, N_("anonymize output")), + OPT_BOOL(0, "reference-excluded-parents", + &reference_excluded_commits, N_("Reference parents which are not in fast-export stream by sha1sum")), + OPT_END() }; diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 50c2fceef4..d7d73061d0 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -66,6 +66,17 @@ test_expect_success 'fast-export master~2..master' ' ' +test_expect_success 'fast-export 
--reference-excluded-parents master~2..master' ' + + git fast-export --reference-excluded-parents master~2..master >actual && + grep commit.refs/heads/master actual >commit-count && + test_line_count = 2 commit-count && + sed "s/master/rewrite/" actual | + (cd new && + git fast-import && + test $MASTER = $(git rev-parse --verify refs/heads/rewrite)) +' + test_expect_success 'iso-8859-1' ' git config i18n.commitencoding ISO8859-1 && From patchwork Wed Nov 14 00:25:58 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 10681733 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 823B613B5 for ; Wed, 14 Nov 2018 00:26:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 734B329562 for ; Wed, 14 Nov 2018 00:26:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 67A7B2B684; Wed, 14 Nov 2018 00:26:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_ADSP_CUSTOM_MED, FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D5E0829562 for ; Wed, 14 Nov 2018 00:26:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731378AbeKNK1H (ORCPT ); Wed, 14 Nov 2018 05:27:07 -0500 Received: from mx0a-00153501.pphosted.com ([67.231.148.48]:38478 "EHLO mx0a-00153501.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731245AbeKNK1E (ORCPT ); Wed, 14 Nov 2018 05:27:04 -0500 Received: from pps.filterd (m0131697.ppops.net 
[127.0.0.1]) by mx0a-00153501.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id wAE0IbQv024221; Tue, 13 Nov 2018 16:26:03 -0800 Received: from mail.palantir.com ([8.4.231.70]) by mx0a-00153501.pphosted.com with ESMTP id 2nr7by051r-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=OK); Tue, 13 Nov 2018 16:26:03 -0800 Received: from sj-prod-exch-01.YOJOE.local (10.129.18.26) by sj-prod-exch-02.YOJOE.local (10.129.18.29) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1531.3; Tue, 13 Nov 2018 16:25:56 -0800 Received: from smtp-transport.yojoe.local (10.129.56.124) by sj-prod-exch-01.YOJOE.local (10.129.18.26) with Microsoft SMTP Server id 15.1.1531.3 via Frontend Transport; Tue, 13 Nov 2018 16:26:01 -0800 Received: from newren2-linux.yojoe.local (newren2-linux.pa.palantir.tech [10.100.71.66]) by smtp-transport.yojoe.local (Postfix) with ESMTPS id 3C837221228F; Tue, 13 Nov 2018 16:26:01 -0800 (PST) From: Elijah Newren To: CC: , , , , , , Elijah Newren Subject: [PATCH v2 09/11] fast-import: remove unmaintained duplicate documentation Date: Tue, 13 Nov 2018 16:25:58 -0800 Message-ID: <20181114002600.29233-10-newren@gmail.com> X-Mailer: git-send-email 2.19.1.1063.g2b8e4a4f82.dirty In-Reply-To: <20181114002600.29233-1-newren@gmail.com> References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2018-11-13_17:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=4 phishscore=0 bulkscore=0 spamscore=0 clxscore=1034 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1811140001 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: 
ClamAV using ClamSMTP fast-import.c has started with a comment for nine and a half years re-directing the reader to Documentation/git-fast-import.txt for maintained documentation. Instead of leaving the unmaintained documentation in place, just excise it. Signed-off-by: Elijah Newren --- fast-import.c | 154 -------------------------------------------------- 1 file changed, 154 deletions(-) diff --git a/fast-import.c b/fast-import.c index 95600c78e0..555d49ad23 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1,157 +1,3 @@ -/* -(See Documentation/git-fast-import.txt for maintained documentation.) -Format of STDIN stream: - - stream ::= cmd*; - - cmd ::= new_blob - | new_commit - | new_tag - | reset_branch - | checkpoint - | progress - ; - - new_blob ::= 'blob' lf - mark? - file_content; - file_content ::= data; - - new_commit ::= 'commit' sp ref_str lf - mark? - ('author' (sp name)? sp '<' email '>' sp when lf)? - 'committer' (sp name)? sp '<' email '>' sp when lf - commit_msg - ('from' sp commit-ish lf)? - ('merge' sp commit-ish lf)* - (file_change | ls)* - lf?; - commit_msg ::= data; - - ls ::= 'ls' sp '"' quoted(path) '"' lf; - - file_change ::= file_clr - | file_del - | file_rnm - | file_cpy - | file_obm - | file_inm; - file_clr ::= 'deleteall' lf; - file_del ::= 'D' sp path_str lf; - file_rnm ::= 'R' sp path_str sp path_str lf; - file_cpy ::= 'C' sp path_str sp path_str lf; - file_obm ::= 'M' sp mode sp (hexsha1 | idnum) sp path_str lf; - file_inm ::= 'M' sp mode sp 'inline' sp path_str lf - data; - note_obm ::= 'N' sp (hexsha1 | idnum) sp commit-ish lf; - note_inm ::= 'N' sp 'inline' sp commit-ish lf - data; - - new_tag ::= 'tag' sp tag_str lf - 'from' sp commit-ish lf - ('tagger' (sp name)? sp '<' email '>' sp when lf)? - tag_msg; - tag_msg ::= data; - - reset_branch ::= 'reset' sp ref_str lf - ('from' sp commit-ish lf)? 
- lf?; - - checkpoint ::= 'checkpoint' lf - lf?; - - progress ::= 'progress' sp not_lf* lf - lf?; - - # note: the first idnum in a stream should be 1 and subsequent - # idnums should not have gaps between values as this will cause - # the stream parser to reserve space for the gapped values. An - # idnum can be updated in the future to a new object by issuing - # a new mark directive with the old idnum. - # - mark ::= 'mark' sp idnum lf; - data ::= (delimited_data | exact_data) - lf?; - - # note: delim may be any string but must not contain lf. - # data_line may contain any data but must not be exactly - # delim. - delimited_data ::= 'data' sp '<<' delim lf - (data_line lf)* - delim lf; - - # note: declen indicates the length of binary_data in bytes. - # declen does not include the lf preceding the binary data. - # - exact_data ::= 'data' sp declen lf - binary_data; - - # note: quoted strings are C-style quoting supporting \c for - # common escapes of 'c' (e..g \n, \t, \\, \") or \nnn where nnn - # is the signed byte value in octal. Note that the only - # characters which must actually be escaped to protect the - # stream formatting is: \, " and LF. Otherwise these values - # are UTF8. - # - commit-ish ::= (ref_str | hexsha1 | sha1exp_str | idnum); - ref_str ::= ref; - sha1exp_str ::= sha1exp; - tag_str ::= tag; - path_str ::= path | '"' quoted(path) '"' ; - mode ::= '100644' | '644' - | '100755' | '755' - | '120000' - ; - - declen ::= # unsigned 32 bit value, ascii base10 notation; - bigint ::= # unsigned integer value, ascii base10 notation; - binary_data ::= # file content, not interpreted; - - when ::= raw_when | rfc2822_when; - raw_when ::= ts sp tz; - rfc2822_when ::= # Valid RFC 2822 date and time; - - sp ::= # ASCII space character; - lf ::= # ASCII newline (LF) character; - - # note: a colon (':') must precede the numerical value assigned to - # an idnum. This is to distinguish it from a ref or tag name as - # GIT does not permit ':' in ref or tag strings. 
- # - idnum ::= ':' bigint; - path ::= # GIT style file path, e.g. "a/b/c"; - ref ::= # GIT ref name, e.g. "refs/heads/MOZ_GECKO_EXPERIMENT"; - tag ::= # GIT tag name, e.g. "FIREFOX_1_5"; - sha1exp ::= # Any valid GIT SHA1 expression; - hexsha1 ::= # SHA1 in hexadecimal format; - - # note: name and email are UTF8 strings, however name must not - # contain '<' or lf and email must not contain any of the - # following: '<', '>', lf. - # - name ::= # valid GIT author/committer name; - email ::= # valid GIT author/committer email; - ts ::= # time since the epoch in seconds, ascii base10 notation; - tz ::= # GIT style timezone; - - # note: comments, get-mark, ls-tree, and cat-blob requests may - # appear anywhere in the input, except within a data command. Any - # form of the data command always escapes the related input from - # comment processing. - # - # In case it is not clear, the '#' that starts the comment - # must be the first character on that line (an lf - # preceded it). - # - - get_mark ::= 'get-mark' sp idnum lf; - cat_blob ::= 'cat-blob' sp (hexsha1 | idnum) lf; - ls_tree ::= 'ls' sp (hexsha1 | idnum) sp path_str lf; - - comment ::= '#' not_lf* lf; - not_lf ::= # Any byte that is not ASCII newline (LF); -*/ - #include "builtin.h" #include "cache.h" #include "repository.h" From patchwork Wed Nov 14 00:25:59 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 10681729 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 49AC5139B for ; Wed, 14 Nov 2018 00:26:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3907D2B662 for ; Wed, 14 Nov 2018 00:26:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2DC3B2B6BB; Wed, 14 Nov 2018 
00:26:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_ADSP_CUSTOM_MED, FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 65E142B679 for ; Wed, 14 Nov 2018 00:26:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731272AbeKNK1E (ORCPT ); Wed, 14 Nov 2018 05:27:04 -0500 Received: from mx0a-00153501.pphosted.com ([67.231.148.48]:38454 "EHLO mx0a-00153501.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731083AbeKNK1C (ORCPT ); Wed, 14 Nov 2018 05:27:02 -0500 Received: from pps.filterd (m0131697.ppops.net [127.0.0.1]) by mx0a-00153501.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id wAE0IbQw024221; Tue, 13 Nov 2018 16:26:03 -0800 Received: from mail.palantir.com ([8.4.231.70]) by mx0a-00153501.pphosted.com with ESMTP id 2nr7by051r-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=OK); Tue, 13 Nov 2018 16:26:03 -0800 Received: from sj-prod-exch-01.YOJOE.local (10.129.18.26) by sj-prod-exch-02.YOJOE.local (10.129.18.29) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1531.3; Tue, 13 Nov 2018 16:25:56 -0800 Received: from smtp-transport.yojoe.local (10.129.56.124) by sj-prod-exch-01.YOJOE.local (10.129.18.26) with Microsoft SMTP Server id 15.1.1531.3 via Frontend Transport; Tue, 13 Nov 2018 16:26:01 -0800 Received: from newren2-linux.yojoe.local (newren2-linux.pa.palantir.tech [10.100.71.66]) by smtp-transport.yojoe.local (Postfix) with ESMTPS id 448BE2212291; Tue, 13 Nov 2018 16:26:01 -0800 (PST) From: Elijah Newren To: CC: , , , , , , Elijah Newren Subject: [PATCH v2 10/11] fast-export: add a --show-original-ids option to show original names Date: Tue, 13 Nov 
2018 16:25:59 -0800
Message-ID: <20181114002600.29233-11-newren@gmail.com>
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>

Knowing the original names (hashes) of commits can sometimes enable
post-filtering that would otherwise be difficult or impossible. In
particular, rewriting commit messages which refer to other prior commits
(on top of whatever other filtering is being done) is very difficult
without knowing the original names of each commit.

In addition, knowing the original names (hashes) of blobs can allow
filtering by blob-id without requiring re-hashing the content of the
blob, and is thus useful as a small optimization.

Once we add original ids for both commits and blobs, we may as well add
them for tags too for completeness. Perhaps someone will have a use for
them.

This commit teaches a new --show-original-ids option to fast-export
which will make it add an 'original-oid' line, carrying the object's
original hash, to blobs, commits, and tags. It also teaches fast-import
to parse (and ignore) such lines.
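To make the intended use by intermediary filters concrete, here is a
minimal sketch of one (illustration only, not part of this patch). The
function name marks_to_original_ids is hypothetical, and the sketch
assumes that no data payload in the stream contains lines that look like
'mark' or 'original-oid' directives; a robust filter would have to skip
payloads by their declared length.

```shell
# Hypothetical filter, not part of git: read a fast-export stream on
# stdin and print "<mark> <original-oid>" for every object whose "mark"
# line is directly followed by an "original-oid" line.
marks_to_original_ids () {
	awk '
		$1 == "mark"                       { mark = $2; next }
		$1 == "original-oid" && mark != "" { print mark, $2 }
		                                   { mark = "" }
	'
}
```

It could be fed with something like
`git fast-export --show-original-ids --all | marks_to_original_ids`,
giving a later rewriting step a table from stream marks to the original
object names. Tags carry no mark in the stream produced by this patch, so
they do not show up in the table.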
Signed-off-by: Elijah Newren --- Documentation/git-fast-export.txt | 7 +++++++ Documentation/git-fast-import.txt | 16 ++++++++++++++++ builtin/fast-export.c | 20 +++++++++++++++----- fast-import.c | 12 ++++++++++++ t/t9350-fast-export.sh | 17 +++++++++++++++++ 5 files changed, 67 insertions(+), 5 deletions(-) diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt index f65026662a..64c01ba918 100644 --- a/Documentation/git-fast-export.txt +++ b/Documentation/git-fast-export.txt @@ -122,6 +122,13 @@ marks the same across runs. repository which already contains the necessary parent commits. +--show-original-ids:: + Add an extra directive to the output for commits and blobs, + `original-oid <SHA1SUM>`. While such directives will likely be + ignored by importers such as git-fast-import, it may be useful + for intermediary filters (e.g. for rewriting commit messages + which refer to older commits, or for stripping blobs by id). + --refspec:: Apply the specified refspec to each ref exported. Multiple of them can be specified. diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index 7ab97745a6..43ab3b1637 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -385,6 +385,7 @@ change to the project. .... 'commit' SP <ref> LF mark? + original-oid? ('author' (SP <name>)? SP LT <email> GT SP <when> LF)? 'committer' (SP <name>)? SP LT <email> GT SP <when> LF data @@ -741,6 +742,19 @@ New marks are created automatically. Existing marks can be moved to another object simply by reusing the same `<idnum>` in another `mark` command. +`original-oid` +~~~~~~~~~~~~~~ +Provides the name of the object in the original source control system. +fast-import will simply ignore this directive, but filter processes +which operate on and modify the stream before feeding to fast-import +may have uses for this information + +.... + 'original-oid' SP <object-identifier> LF +.... + +where `<object-identifier>` is any string not containing LF.
+ `tag` ~~~~~ Creates an annotated tag referring to a specific commit. To create @@ -749,6 +763,7 @@ lightweight (non-annotated) tags see the `reset` command below. .... 'tag' SP <name> LF 'from' SP <commit-ish> LF + original-oid? 'tagger' (SP <name>)? SP LT <email> GT SP <when> LF data .... @@ -823,6 +838,7 @@ assigned mark. .... 'blob' LF mark? + original-oid? data .... diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 3cc98c31ad..e0f794811e 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -38,6 +38,7 @@ static int use_done_feature; static int no_data; static int full_tree; static int reference_excluded_commits; +static int show_original_ids; static struct string_list extra_refs = STRING_LIST_INIT_NODUP; static struct string_list tag_refs = STRING_LIST_INIT_NODUP; static struct refspec refspecs = REFSPEC_INIT_FETCH; @@ -271,7 +272,10 @@ static void export_blob(const struct object_id *oid) mark_next_object(object); - printf("blob\nmark :%"PRIu32"\ndata %lu\n", last_idnum, size); + printf("blob\nmark :%"PRIu32"\n", last_idnum); + if (show_original_ids) + printf("original-oid %s\n", oid_to_hex(oid)); + printf("data %lu\n", size); if (size && fwrite(buf, size, 1, stdout) != 1) die_errno("could not write blob '%s'", oid_to_hex(oid)); printf("\n"); @@ -634,8 +638,10 @@ static void handle_commit(struct commit *commit, struct rev_info *rev, reencoded = reencode_string(message, "UTF-8", encoding); if (!commit->parents) printf("reset %s\n", refname); - printf("commit %s\nmark :%"PRIu32"\n%.*s\n%.*s\ndata %u\n%s", - refname, last_idnum, + printf("commit %s\nmark :%"PRIu32"\n", refname, last_idnum); + if (show_original_ids) + printf("original-oid %s\n", oid_to_hex(&commit->object.oid)); + printf("%.*s\n%.*s\ndata %u\n%s", (int)(author_end - author), author, (int)(committer_end - committer), committer, (unsigned)(reencoded @@ -813,8 +819,10 @@ static void handle_tag(const char *name, struct tag *tag) if (starts_with(name, "refs/tags/")) name += 10; - printf("tag %s\nfrom
:%d\n%.*s%sdata %d\n%.*s\n", - name, tagged_mark, + printf("tag %s\nfrom :%d\n", name, tagged_mark); + if (show_original_ids) + printf("original-oid %s\n", oid_to_hex(&tag->object.oid)); + printf("%.*s%sdata %d\n%.*s\n", (int)(tagger_end - tagger), tagger, tagger == tagger_end ? "" : "\n", (int)message_size, (int)message_size, message ? message : ""); @@ -1095,6 +1103,8 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "anonymize", &anonymize, N_("anonymize output")), OPT_BOOL(0, "reference-excluded-parents", &reference_excluded_commits, N_("Reference parents which are not in fast-export stream by sha1sum")), + OPT_BOOL(0, "show-original-ids", &show_original_ids, + N_("Show original sha1sums of blobs/commits")), OPT_END() }; diff --git a/fast-import.c b/fast-import.c index 555d49ad23..71b6cba00f 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1814,6 +1814,13 @@ static void parse_mark(void) next_mark = 0; } +static void parse_original_identifier(void) +{ + const char *v; + if (skip_prefix(command_buf.buf, "original-oid ", &v)) + read_next_command(); +} + static int parse_data(struct strbuf *sb, uintmax_t limit, uintmax_t *len_res) { const char *data; @@ -1956,6 +1963,7 @@ static void parse_new_blob(void) { read_next_command(); parse_mark(); + parse_original_identifier(); parse_and_store_blob(&last_blob, NULL, next_mark); } @@ -2579,6 +2587,7 @@ static void parse_new_commit(const char *arg) read_next_command(); parse_mark(); + parse_original_identifier(); if (skip_prefix(command_buf.buf, "author ", &v)) { author = parse_ident(v); read_next_command(); @@ -2711,6 +2720,9 @@ static void parse_new_tag(const char *arg) die("Invalid ref name or SHA1 expression: %s", from); read_next_command(); + /* original-oid ... */ + parse_original_identifier(); + /* tagger ... 
*/ if (skip_prefix(command_buf.buf, "tagger ", &v)) { tagger = parse_ident(v); diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index d7d73061d0..5690fe2810 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -77,6 +77,23 @@ test_expect_success 'fast-export --reference-excluded-parents master~2..master' test $MASTER = $(git rev-parse --verify refs/heads/rewrite)) ' +test_expect_success 'fast-export --show-original-ids' ' + + git fast-export --show-original-ids master >output && + grep ^original-oid output| sed -e s/^original-oid.// | sort >actual && + git rev-list --objects master muss >objects-and-names && + awk "{print \$1}" objects-and-names | sort >commits-trees-blobs && + comm -23 actual commits-trees-blobs >unfound && + test_must_be_empty unfound +' + +test_expect_success 'fast-export --show-original-ids | git fast-import' ' + + git fast-export --show-original-ids master muss | git fast-import --quiet && + test $MASTER = $(git rev-parse --verify refs/heads/master) && + test $MUSS = $(git rev-parse --verify refs/tags/muss) +' + test_expect_success 'iso-8859-1' ' git config i18n.commitencoding ISO8859-1 &&

From patchwork Wed Nov 14 00:26:00 2018
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 10681725
From: Elijah Newren
Subject: [PATCH v2 11/11] fast-export: add --always-show-modify-after-rename
Date: Tue, 13 Nov 2018 16:26:00 -0800
Message-ID: <20181114002600.29233-12-newren@gmail.com>
X-Mailer: git-send-email 2.19.1.1063.g2b8e4a4f82.dirty
In-Reply-To: <20181114002600.29233-1-newren@gmail.com>
References: <20181111062312.16342-1-newren@gmail.com> <20181114002600.29233-1-newren@gmail.com>
X-Mailing-List: git@vger.kernel.org

I wanted a way to gather all the following information efficiently
(with as few history traversals as possible):

  * Get all blob sizes
  * Map blob shas to filename(s) they appeared under in the history
  * Find when files and directories were deleted (and whether they
    were later reinstated, since that means they aren't actually gone)
  * Find sets of filenames referring to the same logical 'file' (e.g.
    foo->bar in commit A and bar->baz in commit B mean that
    {foo,bar,baz} refer to the same 'file', so someone wanting to just
    "keep baz and its history" needs all versions of those three
    filenames). I also need to know about things like another foo or
    bar being introduced after the rename, since that breaks the
    connection between filenames.

and then I would generate various aggregations on the data and display
some type of report for the user.

The only way I know of to get blob sizes is via

    git cat-file --batch-all-objects --batch-check

The rest of the data would traditionally be gathered from a log
command, e.g.

    git log --format='%H%n%P%n%cd' --date=short --topo-order --reverse \
        -M --diff-filter=RAMD --no-abbrev --raw -c

however, parsing log output seems slightly dangerous given that it is a
porcelain command.
While we have specified --format and --raw to try to avoid the most
obvious problems, I'm still slightly concerned about --date=short, the
combination of --raw and -c, options that might colorize the output,
and also the --diff-filter (there is no current option named
--no-find-copies or --no-break-rewrites, but what if those turn on by
default in the future, much as we changed the default for detecting
renames?). Each of those is a small worry, but they add up.

A command meant for data serialization, such as fast-export, seems like
a better candidate for this job. There's just one missing item: in
order to connect blob sizes to filenames, I need fast-export to tell me
the blob sha1sum of any file changes. It does this for modifies, but
not always for renames. In particular, if a file is a 100% rename, it
only prints

    R oldname newname

instead of

    R oldname newname
    M 100644 $SHA1 newname

as occurs when there is a rename+modify. Add an option which allows us
to force the latter output even when commits have exact renames of
files.

Signed-off-by: Elijah Newren --- Documentation/git-fast-export.txt | 11 ++++++++++ builtin/fast-export.c | 7 +++++- t/t9350-fast-export.sh | 36 +++++++++++++++++++++++++++++++ 3 files changed, 53 insertions(+), 1 deletion(-) diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt index 64c01ba918..b663b6f8af 100644 --- a/Documentation/git-fast-export.txt +++ b/Documentation/git-fast-export.txt @@ -129,6 +129,17 @@ marks the same across runs. for intermediary filters (e.g. for rewriting commit messages which refer to older commits, or for stripping blobs by id). +--always-show-modify-after-rename:: + When a rename is detected, fast-export normally issues both an + 'R' (rename) and an 'M' (modify) directive. However, if the + contents of the old and new filename match exactly, it will + only issue the rename directive.
Use this flag to have it + always issue the modify directive after the rename, which may + be useful for tools which are using the fast-export stream as + a mechanism for gathering statistics about a repository. Note + that this option only has effect when rename detection is + active (see the -M option). + --refspec:: Apply the specified refspec to each ref exported. Multiple of them can be specified. diff --git a/builtin/fast-export.c b/builtin/fast-export.c index e0f794811e..31ad43077a 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -38,6 +38,7 @@ static int use_done_feature; static int no_data; static int full_tree; static int reference_excluded_commits; +static int always_show_modify_after_rename; static int show_original_ids; static struct string_list extra_refs = STRING_LIST_INIT_NODUP; static struct string_list tag_refs = STRING_LIST_INIT_NODUP; @@ -407,7 +408,8 @@ static void show_filemodify(struct diff_queue_struct *q, putchar('\n'); if (oideq(&ospec->oid, &spec->oid) && - ospec->mode == spec->mode) + ospec->mode == spec->mode && + !always_show_modify_after_rename) break; } /* fallthrough */ @@ -1105,6 +1107,9 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix) &reference_excluded_commits, N_("Reference parents which are not in fast-export stream by sha1sum")), OPT_BOOL(0, "show-original-ids", &show_original_ids, N_("Show original sha1sums of blobs/commits")), + OPT_BOOL(0, "always-show-modify-after-rename", + &always_show_modify_after_rename, + N_("Always provide 'M' directive after 'R'")), OPT_END() }; diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 5690fe2810..5c20065e39 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -630,4 +630,40 @@ test_expect_success 'merge commit gets exported with --import-marks' ' ) ' +test_expect_success 'rename detection and --always-show-modify-after-rename' ' + test_create_repo renames && + ( + cd renames && + test_seq 0 9 >single_digit && + 
test_seq 10 98 >double_digit && + git add . && + git commit -m initial && + + echo 99 >>double_digit && + git mv single_digit single-digit && + git mv double_digit double-digit && + git add double-digit && + git commit -m renames && + + # First, check normal fast-export -M output + git fast-export -M --no-data master >out && + + grep double-digit out >out2 && + test_line_count = 2 out2 && + + grep single-digit out >out2 && + test_line_count = 1 out2 && + + # Now, test with --always-show-modify-after-rename; should + # have an extra "M" directive for "single-digit". + git fast-export -M --no-data --always-show-modify-after-rename master >out && + + grep double-digit out >out2 && + test_line_count = 2 out2 && + + grep single-digit out >out2 && + test_line_count = 2 out2 + ) +' + test_done
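
As a rough illustration of the statistics-gathering use case described in
the commit message of patch 11/11, the sketch below pulls blob ids and the
paths they appear under out of a stream such as the one produced by
"git fast-export -M --no-data --always-show-modify-after-rename master".
The function name blob_ids_to_paths is hypothetical and not part of git,
and the sketch assumes 'M' directives of the form shown in the commit
message (M 100644 $SHA1 newname):

```shell
# Hypothetical helper (not part of git): read a fast-export stream on stdin
# and print "<blob-id> <path>" for every 'M' (modify) directive.
#
# Assumptions and caveats:
#  - the stream was generated with --no-data, so the third field of an 'M'
#    line is a blob id rather than a mark;
#  - commit messages are not skipped, so a message line that happens to
#    look like an 'M' directive would be picked up by mistake;
#  - paths that fast-export emits in quoted form are printed as-is, not
#    unquoted.
blob_ids_to_paths () {
	awk '
		$1 == "M" && NF >= 4 {
			# The path is everything after "M <mode> <dataref> ",
			# so paths containing spaces stay intact.
			path = $0
			sub(/^M [^ ]+ [^ ]+ /, "", path)
			print $3, path
		}
	' | sort -u
}
```

Joining this output with the sizes reported by
"git cat-file --batch-all-objects --batch-check" would then connect blob
sizes to filenames, which is the gap the new option is meant to close.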