From patchwork Fri Oct 29 13:59:12 2021
X-Patchwork-Submitter: Carlo Marcelo Arenas Belón
X-Patchwork-Id: 12592741
Message-Id: <068f897b973b1f8889145f97c42fe6233c272dd5.1635515959.git.gitgitgadget@gmail.com>
Date: Fri, 29 Oct 2021 13:59:12 +0000
Subject: [PATCH v3 1/8] test-genzeros: allow more than 2G zeros in Windows
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Carlo Marcelo Arenas Belón
From: Carlo Marcelo Arenas Belón

d5cfd142ec (tests: teach the test-tool to generate NUL bytes and use it,
2019-02-14) added a way to generate zeroes portably, without using
/dev/zero (needed by HP NonStop), but it counts them in a long variable,
which cannot exceed 2^31 - 1 on Windows.
Use a (POSIX/C99) intmax_t instead, which is at least 64 bits wide on
64-bit Windows, for use in a future test.

Signed-off-by: Carlo Marcelo Arenas Belón
Signed-off-by: Johannes Schindelin
---
 t/helper/test-genzeros.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-genzeros.c b/t/helper/test-genzeros.c
index 9532f5bac97..b1197e91a89 100644
--- a/t/helper/test-genzeros.c
+++ b/t/helper/test-genzeros.c
@@ -3,14 +3,14 @@
 
 int cmd__genzeros(int argc, const char **argv)
 {
-	long count;
+	intmax_t count;
 
 	if (argc > 2) {
 		fprintf(stderr, "usage: %s []\n", argv[0]);
 		return 1;
 	}
 
-	count = argc > 1 ? strtol(argv[1], NULL, 0) : -1L;
+	count = argc > 1 ? strtoimax(argv[1], NULL, 0) : -1;
 
 	while (count < 0 || count--) {
 		if (putchar(0) == EOF)

From patchwork Fri Oct 29 13:59:13 2021
X-Patchwork-Submitter: Johannes Schindelin
X-Patchwork-Id: 12592745
Message-Id: <052197200141c321118b7766f5615a61f951e59f.1635515959.git.gitgitgadget@gmail.com>
Date: Fri, 29 Oct 2021 13:59:13 +0000
Subject: [PATCH v3 2/8] test-tool genzeros: generate large amounts of data more efficiently
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin

In this developer's tests, producing one gigabyte worth of NULs in a
busy loop that writes out individual bytes, unbuffered, took ~27sec.
Writing chunked 256kB buffers instead only took ~0.6sec.

This matters because we are about to introduce a pair of test cases that
want to be able to produce 5GB of NULs, and we cannot use `/dev/zero`
because of the HP NonStop platform's lack of support for that device.

Signed-off-by: Johannes Schindelin
---
 t/helper/test-genzeros.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-genzeros.c b/t/helper/test-genzeros.c
index b1197e91a89..8ca988d6216 100644
--- a/t/helper/test-genzeros.c
+++ b/t/helper/test-genzeros.c
@@ -3,7 +3,10 @@
 
 int cmd__genzeros(int argc, const char **argv)
 {
+	/* static, so that it is NUL-initialized */
+	static const char zeros[256 * 1024];
 	intmax_t count;
+	ssize_t n;
 
 	if (argc > 2) {
 		fprintf(stderr, "usage: %s []\n", argv[0]);
@@ -12,9 +15,19 @@ int cmd__genzeros(int argc, const char **argv)
 
 	count = argc > 1 ? strtoimax(argv[1], NULL, 0) : -1;
 
-	while (count < 0 || count--) {
-		if (putchar(0) == EOF)
+	/* Writing out individual NUL bytes is slow... */
+	while (count < 0)
+		if (write(1, zeros, ARRAY_SIZE(zeros)) < 0)
 			return -1;
+
+	while (count > 0) {
+		n = write(1, zeros, count < ARRAY_SIZE(zeros) ?
+			  count : ARRAY_SIZE(zeros));
+
+		if (n < 0)
+			return -1;
+
+		count -= n;
 	}
 
 	return 0;

From patchwork Fri Oct 29 13:59:14 2021
X-Patchwork-Submitter: Carlo Marcelo Arenas Belón
X-Patchwork-Id: 12592747
Message-Id: <489500bb1dcaffecab42672658990cfc26d52d7c.1635515959.git.gitgitgadget@gmail.com>
Date: Fri, 29 Oct 2021 13:59:14 +0000
Subject: [PATCH v3 3/8] test-lib: add prerequisite for 64-bit platforms
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Carlo Marcelo Arenas Belón
From: Carlo Marcelo Arenas Belón

Allow tests that assume a 64-bit `size_t` to be skipped on 32-bit
platforms, regardless of the size of `long`.

This imitates the `LONG_IS_64BIT` prerequisite.

Signed-off-by: Carlo Marcelo Arenas Belón
Signed-off-by: Johannes Schindelin
---
 t/test-lib.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index adaf03543e8..af1a94c2c20 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1642,6 +1642,10 @@ build_option () {
 	sed -ne "s/^$1: //p"
 }
 
+test_lazy_prereq SIZE_T_IS_64BIT '
+	test 8 -eq "$(build_option sizeof-size_t)"
+'
+
 test_lazy_prereq LONG_IS_64BIT '
 	test 8 -le "$(build_option sizeof-long)"
 '

From patchwork Fri Oct 29 13:59:15 2021
X-Patchwork-Submitter: Matt Cooper
X-Patchwork-Id: 12592749
Date: Fri, 29 Oct 2021 13:59:15 +0000
Subject: [PATCH v3 4/8] t1051: introduce a smudge filter test for extremely large files
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Matt Cooper
From: Matt Cooper

The filter system allows for alterations to file contents when they're
added to the database or workdir. ("Smudge" when moving to the workdir;
"clean" when moving to the database.) This is used natively to handle CRLF
to LF conversions. It's also employed by Git-LFS to replace large files
from the workdir with small tracking files in the repo and vice versa.

Git pulls the entire smudged file into memory. While this is inefficient,
there's a more insidious problem on some platforms due to inconsistency
between using unsigned long and size_t for the same type of data (size of
a file in bytes). On most 64-bit platforms, unsigned long is 64 bits, and
size_t is typedef'd to unsigned long. On Windows, however, unsigned long is
only 32 bits (and therefore on 64-bit Windows, size_t is typedef'd to
unsigned long long in order to be 64 bits).

Practically speaking, this means 64-bit Windows users of Git-LFS can't
handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer
this limitation.

This commit introduces a test exposing the issue; future commits make it
pass. The test simulates the way Git-LFS works by having a tiny file
checked into the repository and expanding it to a huge file on checkout.

Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 t/t1051-large-conversion.sh | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
index 8b7640b3ba8..bff86c13208 100755
--- a/t/t1051-large-conversion.sh
+++ b/t/t1051-large-conversion.sh
@@ -83,4 +83,18 @@ test_expect_success 'ident converts on output' '
 	test_cmp small.clean large.clean
 '
 
+# This smudge filter prepends 5GB of zeros to the file it checks out. This
+# ensures that smudging doesn't mangle large files on 64-bit Windows.
+test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB convert on output' '
+	test_commit test small "a small file" &&
+	test_config filter.makelarge.smudge \
+		"test-tool genzeros $((5*1024*1024*1024)) && cat" &&
+	echo "small filter=makelarge" >.gitattributes &&
+	rm small &&
+	git checkout -- small &&
+	size=$(test_file_size small) &&
+	test "$size" -ge $((5 * 1024 * 1024 * 1024))
+'
+
 test_done

From patchwork Fri Oct 29 13:59:16 2021
X-Patchwork-Submitter: Matt Cooper
X-Patchwork-Id: 12592751
Date: Fri, 29 Oct 2021 13:59:16 +0000
Subject: [PATCH v3 5/8] odb: teach read_blob_entry to use size_t
To: git@vger.kernel.org
Cc: Carlo Arenas, "brian m. carlson", Johannes Schindelin, Matt Cooper
From: Matt Cooper

There is mixed use of size_t and unsigned long to deal with sizes in the
codebase.
Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long narrows
the range. This mostly doesn't cause a problem since Git rarely deals
with files larger than 2^32 bytes.

But adjunct systems such as Git LFS, which use smudge/clean filters to
keep huge files out of the repository, may have huge file contents passed
through some of the functions in entry.c and convert.c. On Windows, this
results in a truncated file being written to the workdir. I traced this to
one specific use of unsigned long in write_entry (and a similar instance
in write_pc_item_to_fd for parallel checkout). That appeared to be for
the call to read_blob_entry, which expects a pointer to unsigned long.

By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens of
additional files, read_blob_entry uses a local unsigned long to call a
chain of functions which aren't prepared to accept size_t.
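
As a rough illustration of the narrowing described above (a sketch, not
code from this series): the fixed-width types below stand in for 64-bit
Windows, where unsigned long has 32 bits and size_t has 64. The names
win_ulong, win_size_t, narrow_size, legacy_read_size and read_size are
hypothetical. The last function only mirrors the shape this patch gives
read_blob_entry: a wide out-parameter in front of a callee that still
reports a narrow one.

```c
#include <stdint.h>

/* Assumption: stand-ins for the Windows widths; not Git's own types. */
typedef uint32_t win_ulong;  /* models a 32-bit `unsigned long` */
typedef uint64_t win_size_t; /* models a 64-bit `size_t` */

/* Narrowing conversion: only the low 32 bits of the size survive. */
static win_ulong narrow_size(win_size_t size)
{
	return (win_ulong)size;
}

/* Hypothetical legacy callee that can only report a 32-bit size. */
static void legacy_read_size(win_ulong *size)
{
	*size = 4096;
}

/*
 * Wrapper in the style of the patched read_blob_entry: callers pass a
 * wide pointer, and a local narrow variable feeds the legacy callee.
 */
static void read_size(win_size_t *size)
{
	win_ulong ul;

	legacy_read_size(&ul);
	*size = ul;
}
```

Under these assumptions, narrow_size((win_size_t)5 << 30) evaluates to
1 << 30: a 5GB size silently becomes 1GB, which is the kind of truncation
the checkout suffered from.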

Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 entry.c                     | 8 +++++---
 entry.h                     | 2 +-
 parallel-checkout.c         | 2 +-
 t/t1051-large-conversion.sh | 2 +-
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/entry.c b/entry.c
index 711ee0693c7..4cb3942dbdc 100644
--- a/entry.c
+++ b/entry.c
@@ -82,11 +82,13 @@ static int create_file(const char *path, unsigned int mode)
 	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
 }
 
-void *read_blob_entry(const struct cache_entry *ce, unsigned long *size)
+void *read_blob_entry(const struct cache_entry *ce, size_t *size)
 {
 	enum object_type type;
-	void *blob_data = read_object_file(&ce->oid, &type, size);
+	unsigned long ul;
+	void *blob_data = read_object_file(&ce->oid, &type, &ul);
 
+	*size = ul;
 	if (blob_data) {
 		if (type == OBJ_BLOB)
 			return blob_data;
@@ -270,7 +272,7 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca
 	int fd, ret, fstat_done = 0;
 	char *new_blob;
 	struct strbuf buf = STRBUF_INIT;
-	unsigned long size;
+	size_t size;
 	ssize_t wrote;
 	size_t newsize = 0;
 	struct stat st;
diff --git a/entry.h b/entry.h
index b8c0e170dc7..61ee8c17604 100644
--- a/entry.h
+++ b/entry.h
@@ -51,7 +51,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
  */
 void unlink_entry(const struct cache_entry *ce);
 
-void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
+void *read_blob_entry(const struct cache_entry *ce, size_t *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
 void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 			   struct stat *st);
diff --git a/parallel-checkout.c b/parallel-checkout.c
index 6b1af32bb3d..b6f4a25642e 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -261,7 +261,7 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
 	struct stream_filter *filter;
 	struct strbuf buf = STRBUF_INIT;
 	char *blob;
-	unsigned long size;
+	size_t size;
 	ssize_t wrote;
 
 	/* Sanity check */
diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
index bff86c13208..8b23d862600 100755
--- a/t/t1051-large-conversion.sh
+++ b/t/t1051-large-conversion.sh
@@ -85,7 +85,7 @@ test_expect_success 'ident converts on output' '
 
 # This smudge filter prepends 5GB of zeros to the file it checks out. This
 # ensures that smudging doesn't mangle large files on 64-bit Windows.
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 		'files over 4GB convert on output' '
 	test_commit test small "a small file" &&
 	test_config filter.makelarge.smudge \

From patchwork Fri Oct 29 13:59:17 2021
X-Patchwork-Submitter: Johannes Schindelin
X-Patchwork-Id: 12592753
Message-Id: <18419070c29aef85c266f01174f436566bf792fd.1635515959.git.gitgitgadget@gmail.com>
Date: Fri, 29 Oct 2021 13:59:17 +0000
Subject: [PATCH v3 6/8] git-compat-util: introduce more size_t
helpers Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Carlo Arenas , "brian m. carlson" , Johannes Schindelin , Johannes Schindelin Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Johannes Schindelin From: Johannes Schindelin We will use them in the next commit. Signed-off-by: Johannes Schindelin --- git-compat-util.h | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index a508dbe5a35..1f41e5611a1 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -113,6 +113,14 @@ #define unsigned_mult_overflows(a, b) \ ((a) && (b) > maximum_unsigned_value_of_type(a) / (a)) +/* + * Returns true if the left shift of "a" by "shift" bits will + * overflow. The type of "a" must be unsigned. + */ +#define unsigned_left_shift_overflows(a, shift) \ + ((shift) < bitsizeof(a) && \ + (a) > maximum_unsigned_value_of_type(a) >> (shift)) + #ifdef __GNUC__ #define TYPEOF(x) (__typeof__(x)) #else @@ -859,6 +867,23 @@ static inline size_t st_sub(size_t a, size_t b) return a - b; } +static inline size_t st_left_shift(size_t a, unsigned shift) +{ + if (unsigned_left_shift_overflows(a, shift)) + die("size_t overflow: %"PRIuMAX" << %u", + (uintmax_t)a, shift); + return a << shift; +} + +static inline unsigned long cast_size_t_to_ulong(size_t a) +{ + if (a != (unsigned long)a) + die("object too large to read on this platform: %" + PRIuMAX" is cut off to %lu", + (uintmax_t)a, (unsigned long)a); + return (unsigned long)a; +} + #ifdef HAVE_ALLOCA_H # include # define xalloca(size) (alloca(size)) From patchwork Fri Oct 29 13:59:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Cooper X-Patchwork-Id: 12592755 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id 2D729C433FE for ; Fri, 29 Oct 2021 13:59:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 17A3661167 for ; Fri, 29 Oct 2021 13:59:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231734AbhJ2OCF (ORCPT ); Fri, 29 Oct 2021 10:02:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38486 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231573AbhJ2OBz (ORCPT ); Fri, 29 Oct 2021 10:01:55 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95F4FC061767 for ; Fri, 29 Oct 2021 06:59:26 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id 71so6629155wma.4 for ; Fri, 29 Oct 2021 06:59:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=oUOtW+VMfF77w6KugTIk7kk1SNVq3GaMJS9+jVTFH/w=; b=aoY85z2ScqUWiujne/dtBtScAk9hA0ronrqLgP89suTWs8FZlqKqLo4kHM7rMN1J7z VSGxCadimkZuzRAmV4eI7a1iko8EWDrV1iWi3GIvLVQ2Hp3LPwE6Y0DIAFIn/Px5Z1GS rDgrAX4CMRBSXM695w3fsLvBIqcHEoqGTQrOO+Yo6FeTawLdwQxFVrfphyNjoA7KhEy9 4RE8ZEGhI6le5xDa5rJceyBUMG7yFD4BHWEwVDyY4sSwabaOQhmdb4Py6z/zEIxbO25Q BKhNmdN1+LS5qHb3tB2UDvdPN6tKkudYQL3aXK+sFCu2KZwV+kvFGq8THtdLu49LTsdz zL4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=oUOtW+VMfF77w6KugTIk7kk1SNVq3GaMJS9+jVTFH/w=; b=RWxegGkUZqNDWDtFNK+EJ8Dkm5ihADHkoGF5j5zWjb+R6DYDkWLQLEh1QEV3RijtuA /6Nx9zLOmhGI9mfkfHz+DUtcHBvOt/pLFc4YzKnAn9pNZfJfjXrdB9MGSf7y4AGPK1M1 9FfHNcb1B+27011DZsFaa7E0lw2qknQz5pRI7rkx7GhgxYGxD/hf73SkEEdAh5CwW8jp 
/ypOwDVJNyHLsOx09HhAA5afRJXJ+LUS1kjlkHz8ibb3+QxrexW3dmx3lm3A1MgVp5tW BZTWoLfvcf/gPO8IsExMsMM6m4r5LpoUP+lOpURrJspEXfWEIv1eLI2i7YxNNZOsmXGK Iw4w== X-Gm-Message-State: AOAM531vkV9RmW7jUVDwwK+j+tMzbb1pdF+Mhh3YTj7GESUcQC9LJe85 T/UqvA8c15zSIoFvtPaGHmYSTlEW10g= X-Google-Smtp-Source: ABdhPJz3Xmbv/65NnMAzFutCOtxHjen+jKi1So2QaXKxqk3H9dfmcNFmO+XhuQAga3xvjSJuhUaPdA== X-Received: by 2002:a1c:20cc:: with SMTP id g195mr5193487wmg.42.1635515965251; Fri, 29 Oct 2021 06:59:25 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h17sm4237634wrp.34.2021.10.29.06.59.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 29 Oct 2021 06:59:24 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Fri, 29 Oct 2021 13:59:18 +0000 Subject: [PATCH v3 7/8] odb: guard against data loss checking out a huge file Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Carlo Arenas , "brian m. carlson" , Johannes Schindelin , Matt Cooper Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Matt Cooper From: Matt Cooper This introduces an additional guard for platforms where `unsigned long` and `size_t` are not of the same size. If the size of an object in the database would overflow `unsigned long`, instead we now exit with an error. A complete fix will have to update _many_ other functions throughout the codebase to use `size_t` instead of `unsigned long`. It will have to be implemented at some stage. This commit puts in a stop-gap for the time being. 
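
A minimal standalone sketch of the guard described above, for illustration only. It is not Git code: `narrow_ulong` is a hypothetical fixed 32-bit stand-in for `unsigned long` on 64-bit Windows, `decode_size()` is a hypothetical name, and it returns `false` where the patch calls `die()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: models a 32-bit `unsigned long` on any platform. */
typedef uint32_t narrow_ulong;

/*
 * Decode a size stored 7 bits per byte, least significant group first,
 * with the high bit marking "more bytes follow" (the layout read by
 * get_delta_hdr_size()). Returns false if a shift would drop bits or
 * the result does not fit in narrow_ulong; the patch die()s instead.
 */
static bool decode_size(const unsigned char *data, const unsigned char *top,
			narrow_ulong *out)
{
	uint64_t size = 0;	/* wide accumulator, like size_t in the patch */
	unsigned shift = 0;
	unsigned char c;

	assert(data && top && out);
	if (data >= top)
		return false;	/* nothing to read */
	do {
		uint64_t bits;

		c = *data++;
		bits = c & 0x7f;
		if (bits) {
			/* checked left shift: refuse to lose high bits */
			if (shift >= 64 || bits > (UINT64_MAX >> shift))
				return false;
			size |= bits << shift;
		}
		shift += 7;
	} while ((c & 0x80) && data < top);

	/* checked narrowing: refuse to truncate silently */
	if (size != (narrow_ulong)size)
		return false;
	*out = (narrow_ulong)size;
	return true;
}
```

With the bytes `80 80 80 80 14` (which encode 5 GiB), the value fits the 64-bit accumulator but the narrowing check rejects it, which is the case the stop-gap turns into an error instead of a silently truncated size.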

Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 delta.h       | 6 +++---
 object-file.c | 6 +++---
 packfile.c    | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/delta.h b/delta.h
index 2df5fe13d95..8a56ec07992 100644
--- a/delta.h
+++ b/delta.h
@@ -90,15 +90,15 @@ static inline unsigned long get_delta_hdr_size(const unsigned char **datap,
 					       const unsigned char *top)
 {
 	const unsigned char *data = *datap;
-	unsigned long cmd, size = 0;
+	size_t cmd, size = 0;
 	int i = 0;
 	do {
 		cmd = *data++;
-		size |= (cmd & 0x7f) << i;
+		size |= st_left_shift(cmd & 0x7f, i);
 		i += 7;
 	} while (cmd & 0x80 && data < top);
 	*datap = data;
-	return size;
+	return cast_size_t_to_ulong(size);
 }
 
 #endif
diff --git a/object-file.c b/object-file.c
index f233b440b22..70e456fc2a3 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1344,7 +1344,7 @@ static int parse_loose_header_extended(const char *hdr, struct object_info *oi,
 				       unsigned int flags)
 {
 	const char *type_buf = hdr;
-	unsigned long size;
+	size_t size;
 	int type, type_len = 0;
 
 	/*
@@ -1388,12 +1388,12 @@ static int parse_loose_header_extended(const char *hdr, struct object_info *oi,
 			if (c > 9)
 				break;
 			hdr++;
-			size = size * 10 + c;
+			size = st_add(st_mult(size, 10), c);
 		}
 	}
 
 	if (oi->sizep)
-		*oi->sizep = size;
+		*oi->sizep = cast_size_t_to_ulong(size);
 
 	/*
 	 * The length must be followed by a zero byte
diff --git a/packfile.c b/packfile.c
index 755aa7aec5e..3ccea004396 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1059,7 +1059,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 		unsigned long len, enum object_type *type, unsigned long *sizep)
 {
 	unsigned shift;
-	unsigned long size, c;
+	size_t size, c;
 	unsigned long used = 0;
 
 	c = buf[used++];
@@ -1073,10 +1073,10 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 			break;
 		}
 		c = buf[used++];
-		size += (c & 0x7f) << shift;
+		size = st_add(size, st_left_shift(c & 0x7f, shift));
 		shift += 7;
 	}
-	*sizep =
size; + *sizep = cast_size_t_to_ulong(size); return used; } From patchwork Fri Oct 29 13:59:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Cooper X-Patchwork-Id: 12592757 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 807AEC433F5 for ; Fri, 29 Oct 2021 13:59:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6284C61100 for ; Fri, 29 Oct 2021 13:59:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231641AbhJ2OCG (ORCPT ); Fri, 29 Oct 2021 10:02:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38490 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231673AbhJ2OCB (ORCPT ); Fri, 29 Oct 2021 10:02:01 -0400 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 59A82C06120E for ; Fri, 29 Oct 2021 06:59:29 -0700 (PDT) Received: by mail-wr1-x42a.google.com with SMTP id o14so16235237wra.12 for ; Fri, 29 Oct 2021 06:59:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=68yNxEpEa4kPiZewgYtxeNWEbIsC/efET42mXCcd9W0=; b=CIYKMwJgWrP4Lk+uH9cOlO1RVRY+DMk/Y1IMYZw03+VPdbJomGI9w3/fzWFxAD4Pxb yHDdtACxthNajIirZiIBTugDKJqjJu9xEupsavsYlnhIdMfh6xDx6wMTzDf4bmvP4q2z PHyxEEUZcu9Ub3WlzrmXbayJkuElBKBTWhy9uyxomL6trv7Ke2WN3RMWBLxtDGVlGc2t cf+T7BnLnt/SSOcL4T0d5/EvZcOGPFf1SdFzEu8xiyEob3pxCU3zbRp4nidrrchnxC+R i9ir9egYa/vKAl6GILDomMVMIk3TNNkJAIGYmUUUmJuFL22UjT7BTD82rrq1wA4zSzUs SJDA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=68yNxEpEa4kPiZewgYtxeNWEbIsC/efET42mXCcd9W0=; b=5dzI/OQ230BGp27blvVxdYtjuq5DzdVj227IFWXaoRg/DRFezCTKvH0umCL+9C7E44 waNibV1HvnTdjR+tlaw2CGOALOPV5mxG7LjPiY3Pkkpu2nzKFiA6bhC9snR4hY3nsPeV PmWesUtkmK4w8sMF31lSfguM9HqbX3s8JEh0I8z3ln0xnCe3wQwDcM1lcGvMAT9gcdKz jOlWJiNjE4/A+oKMUfdDjentamwZcvC9yKVNNLo2lsyfW6NwDT3YgwqrI2yeRdBReVZu +YQ4b14xCN8plZLKSjFeaqiyDbk54w3jXOswSdwYtvGogIgcKGywDFJ+z0BQF9IoislM gbvg== X-Gm-Message-State: AOAM533mak4YkGUvz3r3oI5yvNPcK++3twyiDj0ITQNPXG9hu94EI7Fd sXfveItOc0Fjyh79Q0GjQvyQwJaqqQM= X-Google-Smtp-Source: ABdhPJzuI2ZE3KZW3LoAYykYwncC3lankfx+M8Jk1qkzCZyAioeF945i/xp9E5DQn1f541cVdulifQ== X-Received: by 2002:adf:a2d4:: with SMTP id t20mr14490928wra.229.1635515968603; Fri, 29 Oct 2021 06:59:28 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l9sm5499383wms.40.2021.10.29.06.59.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 29 Oct 2021 06:59:25 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Fri, 29 Oct 2021 13:59:19 +0000 Subject: [PATCH v3 8/8] clean/smudge: allow clean filters to process extremely large files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Carlo Arenas , "brian m. carlson" , Johannes Schindelin , Matt Cooper Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Matt Cooper From: Matt Cooper The filter system allows for alterations to file contents when they're moved between the database and the worktree. We already made sure that it is possible for smudge filters to produce contents that are larger than `unsigned long` can represent (which matters on systems where `unsigned long` is narrower than `size_t`, most notably 64-bit Windows). Now we make sure that clean filters can _consume_ contents that are larger than that. 

Note that this commit only allows clean filters' _input_ to be larger
than can be represented by `unsigned long`.

This change makes only a very minute dent into the much larger project
to teach Git to use `size_t` instead of `unsigned long` wherever
appropriate.

Helped-by: Johannes Schindelin
Signed-off-by: Matt Cooper
Signed-off-by: Johannes Schindelin
---
 convert.c                   |  2 +-
 t/t1051-large-conversion.sh | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index fd9c84b0257..5ad6dfc08a0 100644
--- a/convert.c
+++ b/convert.c
@@ -613,7 +613,7 @@ static int crlf_to_worktree(const char *src, size_t len, struct strbuf *buf,
 
 struct filter_params {
 	const char *src;
-	unsigned long size;
+	size_t size;
 	int fd;
 	const char *cmd;
 	const char *path;
diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
index 8b23d862600..d4cfe8bf5de 100755
--- a/t/t1051-large-conversion.sh
+++ b/t/t1051-large-conversion.sh
@@ -97,4 +97,15 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 	test "$size" -ge $((5 * 1024 * 1024 * 1024))
 '
 
+# This clean filter writes down the size of input it receives. By checking against
+# the actual size, we ensure that cleaning doesn't mangle large files on 64-bit Windows.
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB convert on input' '
+	test-tool genzeros $((5*1024*1024*1024)) >big &&
+	test_config filter.checklarge.clean "wc -c >big.size" &&
+	echo "big filter=checklarge" >.gitattributes &&
+	git add big &&
+	test $(test_file_size big) -eq $(cat big.size)
+'
+
 test_done