From patchwork Tue Nov 2 15:46:04 2021
X-Patchwork-Submitter: Carlo Marcelo Arenas Belón
X-Patchwork-Id: 12599155
Message-Id: <068f897b973b1f8889145f97c42fe6233c272dd5.1635867971.git.gitgitgadget@gmail.com>
Date: Tue, 02 Nov 2021 15:46:04 +0000
Subject: [PATCH v4 1/8] test-genzeros: allow more than 2G zeros in Windows
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten Bögershausen, Johannes Schindelin, Carlo Marcelo Arenas Belón
From: Carlo Marcelo Arenas Belón

d5cfd142ec (tests: teach the test-tool to generate NUL bytes and use it,
2019-02-14) added a way to generate zeroes portably without using
/dev/zero (needed by HP NonStop), but it uses a long variable, which is
limited to values below 2^31 on Windows.

Use a (POSIX/C99) intmax_t instead, which is at least 64 bits wide on
64-bit Windows, so that the helper can be used in a future test.

Signed-off-by: Carlo Marcelo Arenas Belón
Signed-off-by: Johannes Schindelin
---
 t/helper/test-genzeros.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-genzeros.c b/t/helper/test-genzeros.c
index 9532f5bac97..b1197e91a89 100644
--- a/t/helper/test-genzeros.c
+++ b/t/helper/test-genzeros.c
@@ -3,14 +3,14 @@
 
 int cmd__genzeros(int argc, const char **argv)
 {
-	long count;
+	intmax_t count;
 
 	if (argc > 2) {
 		fprintf(stderr, "usage: %s []\n", argv[0]);
 		return 1;
 	}
 
-	count = argc > 1 ? strtol(argv[1], NULL, 0) : -1L;
+	count = argc > 1 ? strtoimax(argv[1], NULL, 0) : -1;
 
 	while (count < 0 || count--) {
 		if (putchar(0) == EOF)

From patchwork Tue Nov 2 15:46:05 2021
X-Patchwork-Submitter: Johannes Schindelin
X-Patchwork-Id: 12599157
Message-Id: <052197200141c321118b7766f5615a61f951e59f.1635867971.git.gitgitgadget@gmail.com>
Date: Tue, 02 Nov 2021 15:46:05 +0000
Subject: [PATCH v4 2/8] test-tool genzeros: generate large amounts of data more efficiently
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten Bögershausen, Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin

In this developer's tests, producing one gigabyte worth of NULs in a
busy loop that writes out individual bytes, unbuffered, took ~27sec.
Writing chunked 256kB buffers instead only took ~0.6sec.

This matters because we are about to introduce a pair of test cases that
want to be able to produce 5GB of NULs, and we cannot use `/dev/zero`
because of the HP NonStop platform's lack of support for that device.

Signed-off-by: Johannes Schindelin
---
 t/helper/test-genzeros.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-genzeros.c b/t/helper/test-genzeros.c
index b1197e91a89..8ca988d6216 100644
--- a/t/helper/test-genzeros.c
+++ b/t/helper/test-genzeros.c
@@ -3,7 +3,10 @@
 
 int cmd__genzeros(int argc, const char **argv)
 {
+	/* static, so that it is NUL-initialized */
+	static const char zeros[256 * 1024];
 	intmax_t count;
+	ssize_t n;
 
 	if (argc > 2) {
 		fprintf(stderr, "usage: %s []\n", argv[0]);
@@ -12,9 +15,19 @@ int cmd__genzeros(int argc, const char **argv)
 
 	count = argc > 1 ? strtoimax(argv[1], NULL, 0) : -1;
 
-	while (count < 0 || count--) {
-		if (putchar(0) == EOF)
+	/* Writing out individual NUL bytes is slow... */
+	while (count < 0)
+		if (write(1, zeros, ARRAY_SIZE(zeros)) < 0)
 			return -1;
+
+	while (count > 0) {
+		n = write(1, zeros, count < ARRAY_SIZE(zeros) ?
+			  count : ARRAY_SIZE(zeros));
+
+		if (n < 0)
+			return -1;
+
+		count -= n;
 	}
 
 	return 0;

From patchwork Tue Nov 2 15:46:06 2021
X-Patchwork-Submitter: Carlo Marcelo Arenas Belón
X-Patchwork-Id: 12599159
Message-Id: <489500bb1dcaffecab42672658990cfc26d52d7c.1635867971.git.gitgitgadget@gmail.com>
Date: Tue, 02 Nov 2021 15:46:06 +0000
Subject: [PATCH v4 3/8] test-lib: add prerequisite for 64-bit platforms
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten Bögershausen, Johannes Schindelin, Carlo Marcelo Arenas Belón
From: Carlo Marcelo Arenas Belón

Allow tests that assume a 64-bit `size_t` to be skipped on 32-bit
platforms, regardless of the size of `long`.

This imitates the `LONG_IS_64BIT` prerequisite.

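As an illustration only (not part of this patch), the two prerequisites can be restated as C predicates over the byte sizes that the build options report. The function names are hypothetical; the comparisons follow the `test 8 -eq` and `test 8 -le` checks in `t/test-lib.sh`.

```c
#include <stddef.h>

/*
 * Hypothetical restatement of SIZE_T_IS_64BIT: the shell check is
 * `test 8 -eq "$(build_option sizeof-size_t)"`, i.e. exactly 8 bytes.
 */
static int size_t_is_64bit(size_t sizeof_size_t)
{
	return sizeof_size_t == 8;
}

/*
 * Hypothetical restatement of LONG_IS_64BIT: the shell check is
 * `test 8 -le "$(build_option sizeof-long)"`, i.e. at least 8 bytes.
 */
static int long_is_64bit(size_t sizeof_long)
{
	return sizeof_long >= 8;
}
```

On an LP64 platform (8-byte `long`, 8-byte `size_t`) both predicates hold; on 64-bit Windows (4-byte `long`, 8-byte `size_t`) only the first does; on a typical 32-bit platform (4 and 4) neither does.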
Signed-off-by: Carlo Marcelo Arenas Belón
Signed-off-by: Johannes Schindelin
---
 t/test-lib.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index adaf03543e8..af1a94c2c20 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1642,6 +1642,10 @@ build_option () {
 	sed -ne "s/^$1: //p"
 }
 
+test_lazy_prereq SIZE_T_IS_64BIT '
+	test 8 -eq "$(build_option sizeof-size_t)"
+'
+
 test_lazy_prereq LONG_IS_64BIT '
 	test 8 -le "$(build_option sizeof-long)"
 '

From patchwork Tue Nov 2 15:46:07 2021
X-Patchwork-Submitter: Matt Cooper
X-Patchwork-Id: 12599161
Date: Tue, 02 Nov 2021 15:46:07 +0000
Subject: [PATCH v4 4/8] t1051: introduce a smudge filter test for extremely large files
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten Bögershausen, Johannes Schindelin, Matt Cooper
From: Matt Cooper

The filter system allows for alterations to file contents when they're
added to the database or working tree. ("Smudge" when moving to the
working tree; "clean" when moving to the database.) This is used
natively to handle CRLF to LF conversions. It's also employed by Git-LFS
to replace large files from the working tree with small tracking files
in the repo and vice versa.

Git reads the entire smudged file into memory to convert it into a
"clean" form to be used in-core. While this is inefficient, there's a
more insidious problem on some platforms due to inconsistency between
using unsigned long and size_t for the same type of data (size of a file
in bytes).

On most 64-bit platforms, unsigned long is 64 bits, and size_t is
typedef'd to unsigned long. On Windows, however, unsigned long is only
32 bits (and therefore on 64-bit Windows, size_t is typedef'd to
unsigned long long in order to be 64 bits).

Practically speaking, this means 64-bit Windows users of Git-LFS can't
handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer
this limitation.

This commit introduces a test exposing the issue; future commits make it
pass. The test simulates the way Git-LFS works by having a tiny file
checked into the repository and expanding it to a huge file on checkout.

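A minimal sketch of the truncation described above, assuming a fixed-width `uint32_t` as a stand-in for the 32-bit `unsigned long` of 64-bit Windows so that it behaves the same on every platform. The type and function names are hypothetical and do not appear in Git.

```c
#include <stdint.h>

/* Stand-in for a 32-bit `unsigned long` (assumption made for this sketch). */
typedef uint32_t narrow_ulong;

/*
 * Models what happens when a 64-bit byte count is stored in a 32-bit
 * variable on its way through the code: only the low 32 bits survive.
 */
static uint64_t size_after_narrowing(uint64_t size)
{
	narrow_ulong truncated = (narrow_ulong)size; /* value modulo 2^32 */

	return truncated;
}
```

With a 5 GiB payload followed by a 13-byte file (an illustrative size), the narrowed count is 1 GiB + 13 bytes, because 5 * 2^30 modulo 2^32 is 2^30. Sizes below 2^32 pass through unchanged, which is why the problem only shows up with very large files.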
Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 t/t1051-large-conversion.sh | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
index 8b7640b3ba8..e7f9f0bdc56 100755
--- a/t/t1051-large-conversion.sh
+++ b/t/t1051-large-conversion.sh
@@ -83,4 +83,19 @@ test_expect_success 'ident converts on output' '
 	test_cmp small.clean large.clean
 '
 
+# This smudge filter prepends 5GB of zeros to the file it checks out. This
+# ensures that smudging doesn't mangle large files on 64-bit Windows.
+test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB convert on output' '
+	test_commit test small "a small file" &&
+	small_size=$(test_file_size small) &&
+	test_config filter.makelarge.smudge \
+		"test-tool genzeros $((5*1024*1024*1024)) && cat" &&
+	echo "small filter=makelarge" >.gitattributes &&
+	rm small &&
+	git checkout -- small &&
+	size=$(test_file_size small) &&
+	test "$size" -eq $((5 * 1024 * 1024 * 1024 + $small_size))
+'
+
 test_done

From patchwork Tue Nov 2 15:46:08 2021
X-Patchwork-Submitter: Matt Cooper
X-Patchwork-Id: 12599163
Message-Id: <308a8f2a3ade63ef21feb945e45866f2a83ae101.1635867971.git.gitgitgadget@gmail.com>
Date: Tue, 02 Nov 2021 15:46:08 +0000
Subject: [PATCH v4 5/8] odb: teach read_blob_entry to use size_t
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten Bögershausen, Johannes Schindelin, Matt Cooper
From: Matt Cooper

There is mixed use of size_t and unsigned long to deal with sizes in the
codebase. Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long
narrows the range.

This mostly doesn't cause a problem since Git rarely deals with files
larger than 2^32 bytes. But adjunct systems such as Git LFS, which use
smudge/clean filters to keep huge files out of the repository, may have
huge file contents passed through some of the functions in entry.c and
convert.c. On Windows, this results in a truncated file being written to
the workdir. I traced this to one specific use of unsigned long in
write_entry (and a similar instance in write_pc_item_to_fd for parallel
checkout). That appeared to be for the call to read_blob_entry, which
expects a pointer to unsigned long.

By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens
of additional files, read_blob_entry uses a local unsigned long to call
a chain of functions which aren't prepared to accept size_t.

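The adapter approach described above can be sketched in isolation as follows. This is an illustration under assumptions, not Git's code: `legacy_read()` stands in for the chain of callees that still report sizes through `unsigned long *`, and both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical legacy callee: reports the size through `unsigned long *`. */
static const void *legacy_read(const char *data, unsigned long len,
			       unsigned long *size)
{
	*size = len;
	return data;
}

/*
 * Hypothetical wrapper in the style of the patched read_blob_entry():
 * callers see size_t, while a local unsigned long bridges to the
 * legacy interface.
 */
static const void *read_with_size_t(const char *data, unsigned long len,
				    size_t *size)
{
	unsigned long ul;
	const void *p = legacy_read(data, len, &ul);

	/* Not lossy on the data models discussed here (size_t is at least as wide). */
	*size = ul;
	return p;
}
```

The design choice is to keep the change local: callers can hold sizes in `size_t` right away, and the remaining `unsigned long` interfaces can be converted later without touching the callers again.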
Helped-by: Johannes Schindelin Signed-off-by: Matt Cooper Signed-off-by: Johannes Schindelin --- entry.c | 8 +++++--- entry.h | 2 +- parallel-checkout.c | 2 +- t/t1051-large-conversion.sh | 2 +- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/entry.c b/entry.c index 711ee0693c7..4cb3942dbdc 100644 --- a/entry.c +++ b/entry.c @@ -82,11 +82,13 @@ static int create_file(const char *path, unsigned int mode) return open(path, O_WRONLY | O_CREAT | O_EXCL, mode); } -void *read_blob_entry(const struct cache_entry *ce, unsigned long *size) +void *read_blob_entry(const struct cache_entry *ce, size_t *size) { enum object_type type; - void *blob_data = read_object_file(&ce->oid, &type, size); + unsigned long ul; + void *blob_data = read_object_file(&ce->oid, &type, &ul); + *size = ul; if (blob_data) { if (type == OBJ_BLOB) return blob_data; @@ -270,7 +272,7 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca int fd, ret, fstat_done = 0; char *new_blob; struct strbuf buf = STRBUF_INIT; - unsigned long size; + size_t size; ssize_t wrote; size_t newsize = 0; struct stat st; diff --git a/entry.h b/entry.h index b8c0e170dc7..61ee8c17604 100644 --- a/entry.h +++ b/entry.h @@ -51,7 +51,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts); */ void unlink_entry(const struct cache_entry *ce); -void *read_blob_entry(const struct cache_entry *ce, unsigned long *size); +void *read_blob_entry(const struct cache_entry *ce, size_t *size); int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st); void update_ce_after_write(const struct checkout *state, struct cache_entry *ce, struct stat *st); diff --git a/parallel-checkout.c b/parallel-checkout.c index 6b1af32bb3d..b6f4a25642e 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -261,7 +261,7 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd, struct stream_filter *filter; struct strbuf buf = STRBUF_INIT; char 
*blob; - unsigned long size; + size_t size; ssize_t wrote; /* Sanity check */ diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh index e7f9f0bdc56..e6d52f98b15 100755 --- a/t/t1051-large-conversion.sh +++ b/t/t1051-large-conversion.sh @@ -85,7 +85,7 @@ test_expect_success 'ident converts on output' ' # This smudge filter prepends 5GB of zeros to the file it checks out. This # ensures that smudging doesn't mangle large files on 64-bit Windows. -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ 'files over 4GB convert on output' ' test_commit test small "a small file" && small_size=$(test_file_size small) && From patchwork Tue Nov 2 15:46:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin X-Patchwork-Id: 12599165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A535CC433F5 for ; Tue, 2 Nov 2021 15:46:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8A2B3604AC for ; Tue, 2 Nov 2021 15:46:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232672AbhKBPtC (ORCPT ); Tue, 2 Nov 2021 11:49:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234450AbhKBPsy (ORCPT ); Tue, 2 Nov 2021 11:48:54 -0400 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1FC23C06120A for ; Tue, 2 Nov 2021 08:46:18 -0700 (PDT) Received: by mail-wr1-x42a.google.com with SMTP id d27so16040637wrb.6 for ; Tue, 02 Nov 2021 
08:46:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=V3UJ7kXSNpQVGVtTGUng6sI3LTiP2EGs8G33fVBLwWI=; b=hXz+sL6pWNdNKIcZt9dKnGYRZZjkNdQBZ+7/wt7RwmAqLIxgplNRagksCnSuhy2Z87 OGGSyezypzFDkY/9zUAq5YEs/dGK42VciMUb7MNj5b+uhzHoJ+bJQu3FvyiQ+O/A/7Hv oHX2fR8oPcVO4N0D8gJjm36gPuJT4p5hNXtsWY/UbrhLaW7xhu2vlXZ9JGE7QFiqIxqO d23ppD44tO3A5WiBMkIYK//nwuhqGSpUpneBT6o0nJQ+INnU0/59m8cWmLTDqK4ZCAOl 1yJkkRUDY///R6TtmmREgsXNxNSbNYXuCMhRMwVgCFUEgEdhgCkYpswbZ1/cct4AjpHF WV1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=V3UJ7kXSNpQVGVtTGUng6sI3LTiP2EGs8G33fVBLwWI=; b=ymBzrHFY2CgVoGB7Bu73Jc5nfNEbtU4wEmhQIS6X08pHea/fZ1xFTu9pQRTn1iHR5C aQt7ZUloNK8R5ILXUfqPICzKBO2K4nZ3zFSsL0DZO34+L8mFGynUbrMDp0LsQYgszI6Y rfAdQtWb+0MaQOT4RXmZaVhy/7JjNJbkmLJwRD0Lb54BvWJHyai9yMLTxjqm1J8Ebmt8 spYR6BzA+2uwnDNJdmiTWPHLQmxWok0KFzriYPnLiMFspkaJ2qS1zl37etavlsgmHC6/ kQawnWZ7ud47yFY7l42dJD08aPUeytNBwUrZmB9wYIDBiSh8triQt/4ZXFoUTt154sVI cFfg== X-Gm-Message-State: AOAM530t2ez0pMtPuNdzocZ71aGTd3m6aartDi75GaXDfLh0KgFYhIxR UjiVGiOo5GUs+Th9k4IF/qS2aGCRwcg= X-Google-Smtp-Source: ABdhPJwocY3agmk9edwQLkxvDlcPIjImAvHiiLfbCyyi85dzH5LOWkiiHGiJdP94fYgvoAHku+O1Dg== X-Received: by 2002:a5d:4143:: with SMTP id c3mr28489349wrq.254.1635867977320; Tue, 02 Nov 2021 08:46:17 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k10sm2859942wmr.32.2021.11.02.08.46.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Nov 2021 08:46:16 -0700 (PDT) Message-Id: <65bc291b680a30a63cf62a1dc5411ab47e67ae09.1635867971.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 02 Nov 2021 15:46:09 +0000 Subject: [PATCH v4 6/8] git-compat-util: introduce more size_t helpers Fcc: 
Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Carlo Arenas , "brian m. carlson" , Johannes Schindelin , Philip Oakley , Torsten =?utf-8?q?B=C3=B6gershausen?= , Johannes Schindelin , Johannes Schindelin Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Johannes Schindelin From: Johannes Schindelin We will use them in the next commit. Signed-off-by: Johannes Schindelin --- git-compat-util.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index a508dbe5a35..1f41e5611a1 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -113,6 +113,14 @@ #define unsigned_mult_overflows(a, b) \ ((a) && (b) > maximum_unsigned_value_of_type(a) / (a)) +/* + * Returns true if the left shift of "a" by "shift" bits will + * overflow. The type of "a" must be unsigned. + */ +#define unsigned_left_shift_overflows(a, shift) \ + ((shift) < bitsizeof(a) && \ + (a) > maximum_unsigned_value_of_type(a) >> (shift)) + #ifdef __GNUC__ #define TYPEOF(x) (__typeof__(x)) #else @@ -859,6 +867,23 @@ static inline size_t st_sub(size_t a, size_t b) return a - b; } +static inline size_t st_left_shift(size_t a, unsigned shift) +{ + if (unsigned_left_shift_overflows(a, shift)) + die("size_t overflow: %"PRIuMAX" << %u", + (uintmax_t)a, shift); + return a << shift; +} + +static inline unsigned long cast_size_t_to_ulong(size_t a) +{ + if (a != (unsigned long)a) + die("object too large to read on this platform: %" + PRIuMAX" is cut off to %lu", + (uintmax_t)a, (unsigned long)a); + return (unsigned long)a; +} + #ifdef HAVE_ALLOCA_H # include # define xalloca(size) (alloca(size)) From patchwork Tue Nov 2 15:46:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Cooper X-Patchwork-Id: 12599167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org 
 (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8BDBC433EF for ; Tue, 2 Nov 2021 15:46:39 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 98A6B60F5A for ; Tue, 2 Nov 2021 15:46:39 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234669AbhKBPtG (ORCPT ); Tue, 2 Nov 2021 11:49:06 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234537AbhKBPsz (ORCPT ); Tue, 2 Nov 2021 11:48:55 -0400
Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E4B8C06120D for ; Tue, 2 Nov 2021 08:46:20 -0700 (PDT)
Received: by mail-wr1-x431.google.com with SMTP id t30so9168166wra.10 for ; Tue, 02 Nov 2021 08:46:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=oUOtW+VMfF77w6KugTIk7kk1SNVq3GaMJS9+jVTFH/w=; b=pcgwjTqfpujds5kEDC67I4fRtQRc+xn+3GLP13J3RBz3vmSErEhpvR+UyE2sBGUfcl hT2qQFOC4DhlrSlRiWf/S8Nlc9RBXSiueZ5aTkQJON4FuRSYc1mHGbUqL0Rw4kCEyjcE xMa95F/jiTpeeJzcLNxmJeYg2qCiWRokHW3x9LTsJw8hApCEUp08DS9Pgj/KPYngfZYs Wt5x2e9R/jKs0QQE7SVd5LiDlJbgmREWpqe5cyBlolUOaXNLqeo4mg7Dx9Ia90BfBFDH sScjdR5oQkE02Fq3Mn4EWdEHH/sr4xt3lMkWnrq5wcRtcyu7YVcSzI1eRHpQ4oQvOPzA 10wg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=oUOtW+VMfF77w6KugTIk7kk1SNVq3GaMJS9+jVTFH/w=; b=BeB6059zshronmYZPTZTtVpLYXqiqBSHogI6BZ5Mhfk9sk1+GbycC4Mh/4tMUnzR+S xcaixg/u/YA/bk44g2gkmyAS/2hNst4FdHAuhPpO5TtFmK6IWMX4ebxtoG22Rzg2Gh9g lo0aoBl4C0zoiMDd9cValL/jkxkByGEWcaG0p37tfzX+jKakoAJlym3sHwlf/HJhTrnM J9HjfWhtsI3hrsqUx4chf89ZjzKsU7YcF/koDKuiANhNtPG+5eE/lAdYIZXXqTNU45Lq Z1A5aEyWzr0ujW/gc5BZCfXl98H7v6v8xImTSGry6BO4lXvOeGybhHjeViBpXMILkiT4 DkbA==
X-Gm-Message-State: AOAM5330XlYUKioNOhQUyDEpqKPUzadzTL11Xoth1cHMxX7irKVIdnVX DCH0sduWszkUm2NdgTvily5byM8ocD8=
X-Google-Smtp-Source: ABdhPJzzBkMQVPWnh1/FdYd4FfCVCNyl9xXfLEQhUu5UpLNiQ5GajBuO/SzfFiXNxg9L29MTHXpt7A==
X-Received: by 2002:a05:6000:154f:: with SMTP id 15mr35522306wry.74.1635867978186; Tue, 02 Nov 2021 08:46:18 -0700 (PDT)
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r7sm9304977wrq.29.2021.11.02.08.46.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Nov 2021 08:46:17 -0700 (PDT)
Message-Id: <7b6655f03f50925747fa9870333bab39a690d3f8.1635867971.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
Date: Tue, 02 Nov 2021 15:46:10 +0000
Subject: [PATCH v4 7/8] odb: guard against data loss checking out a huge file
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten =?utf-8?q?B=C3=B6gershausen?=, Johannes Schindelin, Matt Cooper
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Matt Cooper

From: Matt Cooper

This introduces an additional guard for platforms where `unsigned long`
and `size_t` are not of the same size. If the size of an object in the
database would overflow `unsigned long`, instead we now exit with an
error.

A complete fix will have to update _many_ other functions throughout the
codebase to use `size_t` instead of `unsigned long`. It will have to be
implemented at some stage.

This commit puts in a stop-gap for the time being.

Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 delta.h       | 6 +++---
 object-file.c | 6 +++---
 packfile.c    | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/delta.h b/delta.h
index 2df5fe13d95..8a56ec07992 100644
--- a/delta.h
+++ b/delta.h
@@ -90,15 +90,15 @@ static inline unsigned long get_delta_hdr_size(const unsigned char **datap,
 					       const unsigned char *top)
 {
 	const unsigned char *data = *datap;
-	unsigned long cmd, size = 0;
+	size_t cmd, size = 0;
 	int i = 0;
 	do {
 		cmd = *data++;
-		size |= (cmd & 0x7f) << i;
+		size |= st_left_shift(cmd & 0x7f, i);
 		i += 7;
 	} while (cmd & 0x80 && data < top);
 	*datap = data;
-	return size;
+	return cast_size_t_to_ulong(size);
 }
 
 #endif
diff --git a/object-file.c b/object-file.c
index f233b440b22..70e456fc2a3 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1344,7 +1344,7 @@ static int parse_loose_header_extended(const char *hdr, struct object_info *oi,
 				       unsigned int flags)
 {
 	const char *type_buf = hdr;
-	unsigned long size;
+	size_t size;
 	int type, type_len = 0;
 
 	/*
@@ -1388,12 +1388,12 @@ static int parse_loose_header_extended(const char *hdr, struct object_info *oi,
 			if (c > 9)
 				break;
 			hdr++;
-			size = size * 10 + c;
+			size = st_add(st_mult(size, 10), c);
 		}
 	}
 
 	if (oi->sizep)
-		*oi->sizep = size;
+		*oi->sizep = cast_size_t_to_ulong(size);
 
 	/*
 	 * The length must be followed by a zero byte
diff --git a/packfile.c b/packfile.c
index 755aa7aec5e..3ccea004396 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1059,7 +1059,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 		unsigned long len, enum object_type *type, unsigned long *sizep)
 {
 	unsigned shift;
-	unsigned long size, c;
+	size_t size, c;
 	unsigned long used = 0;
 
 	c = buf[used++];
@@ -1073,10 +1073,10 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 			break;
 		}
 		c = buf[used++];
-		size += (c & 0x7f) << shift;
+		size = st_add(size, st_left_shift(c & 0x7f, shift));
 		shift += 7;
 	}
-	*sizep = size;
+	*sizep = cast_size_t_to_ulong(size);
 	return used;
 }

From patchwork Tue Nov 2 15:46:11 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matt Cooper
X-Patchwork-Id: 12599169
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1EFBC433EF for ; Tue, 2 Nov 2021 15:46:41 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8A2BA61050 for ; Tue, 2 Nov 2021 15:46:41 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234360AbhKBPtO (ORCPT ); Tue, 2 Nov 2021 11:49:14 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234581AbhKBPs4 (ORCPT ); Tue, 2 Nov 2021 11:48:56 -0400
Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B668CC06120E for ; Tue, 2 Nov 2021 08:46:20 -0700 (PDT)
Received: by mail-wm1-x32d.google.com with SMTP id v127so15931249wme.5 for ; Tue, 02 Nov 2021 08:46:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=MfWB6/O/XDWVBgnDBb7J+s3FjcbAOYL7y65rKAWAVTU=; b=TXR0arSRK9aFg4J8cTZcYaDYF3SjhQdYInT0tFQKqHGHGPtuM4dp6xQeyn7BLpwvfW Z7ydksBYaOUyWl6zV/Clue19opfeKHRbwX/WglcpZj2H5YHFvYCADAPq3yNMkJ66fAcM 5gOP62gJUQEqNuYr3QqTknpH7v4ZLcepBQhFKvdPoDbm+QRTJuWdx+1dxW9dcHkZbSeZ hlPaPUR34Dnmtef//EUl2G8w+OGpjKglx0aXLOG9MSFhYv+N8T8LbajeBDbBckRTBgN1 fcQd1ADeXitlc/lb8LqwXpIHArUr7yR9ThLliQR4e4EGCTu1wD6EFRW4+xpbithKI7A9 +Kww==
X-Google-DKIM-Signature: v=1; a=rsa-sha256;
 c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=MfWB6/O/XDWVBgnDBb7J+s3FjcbAOYL7y65rKAWAVTU=; b=iUNy2WviAwv1PeLf1+i/ao4+JT1Anywjb4j+BtOVoRYtNIelRtLSWjbnVCnhBApenv 0ZbXvq7IwZ+rZcYD3dul2VmZnW28EhFAHgnjJl60i1JZpPH3UOSv+a3msAevwqu4gMdX Cjv/PZwdJH4i//iVY/NJkfIYCpdbPNpuPKn0SqnDEYUuz+6pFr+yUgEgQhV5wI1Y02Y8 VuAUTINYdh8pg11c04NX2sp0qHyr45fog46ZgJoULyyJlwAVVE0tl98YkealKjbg54f2 MLyVPx29Nl94Pqo4hnGMw64vCMkiOau5OyPAOjGWC/Pw8YUrB8PyzOZK9Oe7/+2oeKcG b+ig==
X-Gm-Message-State: AOAM5334mCKP39cFK78CSyQhWx2+Gcp7BU1/cRGBFniUAqwnhPcFgGGV LZwU4uotRiSrL/oAN3dG30X0OKBK1Hs=
X-Google-Smtp-Source: ABdhPJwpV+R7wna4pZgBvMg2t4IS6UBlNerU4s9yv4fqn8vDgryAPBo1sV5QZyB8CtKqPIu1SrmNEg==
X-Received: by 2002:a1c:a904:: with SMTP id s4mr7738903wme.163.1635867979175; Tue, 02 Nov 2021 08:46:19 -0700 (PDT)
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id v3sm17993076wrg.23.2021.11.02.08.46.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Nov 2021 08:46:18 -0700 (PDT)
Message-Id: <41fda423982d99847d3879f5ea1eb3570ae9eab6.1635867971.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
Date: Tue, 02 Nov 2021 15:46:11 +0000
Subject: [PATCH v4 8/8] clean/smudge: allow clean filters to process extremely large files
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Philip Oakley, Torsten =?utf-8?q?B=C3=B6gershausen?=, Johannes Schindelin, Matt Cooper
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Matt Cooper

From: Matt Cooper

The filter system allows for alterations to file contents when they're
moved between the database and the worktree. We already made sure that
it is possible for smudge filters to produce contents that are larger
than `unsigned long` can represent (which matters on systems where
`unsigned long` is narrower than `size_t`, most notably 64-bit Windows).
Now we make sure that clean filters can _consume_ contents that are
larger than that.

Note that this commit only allows clean filters' _input_ to be larger
than can be represented by `unsigned long`.

This change makes only a very minute dent into the much larger project
to teach Git to use `size_t` instead of `unsigned long` wherever
appropriate.

Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 convert.c                   |  2 +-
 t/t1051-large-conversion.sh | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index fd9c84b0257..5ad6dfc08a0 100644
--- a/convert.c
+++ b/convert.c
@@ -613,7 +613,7 @@ static int crlf_to_worktree(const char *src, size_t len, struct strbuf *buf,
 
 struct filter_params {
 	const char *src;
-	unsigned long size;
+	size_t size;
 	int fd;
 	const char *cmd;
 	const char *path;
diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
index e6d52f98b15..042b0e44292 100755
--- a/t/t1051-large-conversion.sh
+++ b/t/t1051-large-conversion.sh
@@ -98,4 +98,15 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 	test "$size" -eq $((5 * 1024 * 1024 * 1024 + $small_size))
 '
 
+# This clean filter writes down the size of input it receives. By checking against
+# the actual size, we ensure that cleaning doesn't mangle large files on 64-bit Windows.
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB convert on input' '
+	test-tool genzeros $((5*1024*1024*1024)) >big &&
+	test_config filter.checklarge.clean "wc -c >big.size" &&
+	echo "big filter=checklarge" >.gitattributes &&
+	git add big &&
+	test $(test_file_size big) -eq $(cat big.size)
+'
+
 test_done
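
For readers who want to try the overflow guards from patches 6/8 and 7/8
outside of Git, here is a minimal standalone sketch. It is not the code
from the series: the names wide_t, narrow_t, checked_shift(),
checked_narrow() and decode_size() are made up for illustration. It
assumes a 64-bit wide_t standing in for size_t and a 32-bit narrow_t
standing in for `unsigned long` on 64-bit Windows, and it returns false
where the helpers in the series would die().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins (assumptions, not Git types): wide_t plays the
 * role of a 64-bit size_t, narrow_t a 32-bit unsigned long. */
typedef uint64_t wide_t;
typedef uint32_t narrow_t;

/* Store a << shift in *out; return false if bits would be lost.
 * This sketch also rejects non-zero values shifted by 64 or more. */
static bool checked_shift(wide_t a, unsigned shift, wide_t *out)
{
	if (shift >= 64) {
		if (a != 0)
			return false;
		*out = 0;
		return true;
	}
	if (a > (UINT64_MAX >> shift))
		return false;
	*out = a << shift;
	return true;
}

/* Narrow to narrow_t; return false if the value would be cut off. */
static bool checked_narrow(wide_t a, narrow_t *out)
{
	if (a != (wide_t)(narrow_t)a)
		return false;
	*out = (narrow_t)a;
	return true;
}

/* Decode a size stored 7 bits per byte, least significant group first,
 * with 0x80 marking "more bytes follow". Truncated input is treated as
 * an error here, which is a choice made for this sketch. */
static bool decode_size(const unsigned char *buf, size_t len, narrow_t *out)
{
	wide_t size = 0;
	unsigned shift = 0;
	size_t i = 0;
	unsigned char c;

	do {
		wide_t part;

		if (i >= len)
			return false;
		c = buf[i++];
		if (!checked_shift(c & 0x7f, shift, &part))
			return false;
		size |= part;
		if (shift < 64)
			shift += 7; /* saturate so the counter cannot wrap */
	} while (c & 0x80);

	return checked_narrow(size, out);
}
```

With these definitions, a five-byte encoding of 4 GiB
(0x80 0x80 0x80 0x80 0x10) decodes to a value that does not fit into
narrow_t, so decode_size() reports failure instead of silently returning
0. That is the kind of data loss the series guards against.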