From patchwork Fri Nov 15 14:15:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246411 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EB0B113BD for ; Fri, 15 Nov 2019 14:16:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CC40F2073A for ; Fri, 15 Nov 2019 14:16:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WrFUgNSL" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727443AbfKOOQC (ORCPT ); Fri, 15 Nov 2019 09:16:02 -0500 Received: from mail-wr1-f67.google.com ([209.85.221.67]:39668 "EHLO mail-wr1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727411AbfKOOQC (ORCPT ); Fri, 15 Nov 2019 09:16:02 -0500 Received: by mail-wr1-f67.google.com with SMTP id l7so11149302wrp.6 for ; Fri, 15 Nov 2019 06:16:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qknXGnx0IXGyyYDk/MfYX2iTzXdGU5BdkNTZRccRtUA=; b=WrFUgNSL7J7ErMHJzRvQJrRwKpFgLnHrQB/xewC6GhI58xEucUfu3GonogZExfyiyK X+9eZzfO4QuniRD6Q9hxCtQ9ObgtWulytsz6qDolHkHiBZkpghY+46Km8lR3qwsuLXMv yheqiwyM0rB4mJg07ftgoiNu9oF+yjHwZGzDHFd3mQmINaIx2thUPlJKO4DBSPRJhGbT /bEpYLebAheGeaPUCSZ1C+L3dZckUop/HtlMLFaRD4iQsNqJIqET7h27L62PWTK0j3xy jiZq/No1vwI7Yik8J/yTlY3fnL2sGCxBGG2kr+/19diZYbVY6CW+RPna4BN23O7385Zd UGEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qknXGnx0IXGyyYDk/MfYX2iTzXdGU5BdkNTZRccRtUA=; 
b=NW34LWgAcUifYIOJBe8sQveHsk24Tr7gI+hJdiR39xL9LNy8CQmi9Mz3GGD9zhntAS /TnnCm4dacuzn6p8Kyjk2SmqFNJPWh3buS0Z791csSfUe1KnJzGd99uB8Pi9fIFY8UaP i0EQIqgFN/6+FnXZcEfeimSjqU/Ok25IgkBZj4yXT5DzUOcuVWJ2e+4BeAumkFiQnEDp VEgBjtK+LK/EiH7pvffFtTXm2fQ8AMip2X/d50KZcdroIADe87mHL1UGkBRAY0EhMvfo 3cc6DpizL8lvAkO1nn+qCxph8ap8/tqhMMnvtb6vDkXsUTW+fQImsacF6KuFLrp+/hpu mBZQ== X-Gm-Message-State: APjAAAV4K3NvvnVWGVfq7Fd1RxffXEhn44XT8SJnJKFBSIzIenTsaxfJ ECNXaUL6LPxzJ0WJiTKqlONiu2Df X-Google-Smtp-Source: APXvYqxcUn4PThRwmt5TJuUTjxys4aqvhTOa3CmH03rJF9AQDBRhCpN5PvGcd4qg6f1vAObIc1OtiQ== X-Received: by 2002:adf:f445:: with SMTP id f5mr16054876wrp.193.1573827359795; Fri, 15 Nov 2019 06:15:59 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.15.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:15:59 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan , James Ramsay Subject: [PATCH v3 1/9] builtin/pack-objects: report reused packfile objects Date: Fri, 15 Nov 2019 15:15:33 +0100 Message-Id: <20191115141541.11149-2-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King To see when packfile reuse kicks in or not, it is useful to show reused packfile objects statistics in the output of upload-pack. 
Helped-by: James Ramsay Signed-off-by: Jeff King Signed-off-by: Christian Couder --- builtin/pack-objects.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 5876583220..f2c2703090 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3509,7 +3509,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) if (progress) fprintf_ln(stderr, _("Total %"PRIu32" (delta %"PRIu32")," - " reused %"PRIu32" (delta %"PRIu32")"), - written, written_delta, reused, reused_delta); + " reused %"PRIu32" (delta %"PRIu32")," + " pack-reused %"PRIu32), + written, written_delta, reused, reused_delta, + reuse_packfile_objects); return 0; } From patchwork Fri Nov 15 14:15:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246413 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3D4E417EF for ; Fri, 15 Nov 2019 14:16:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1F20220733 for ; Fri, 15 Nov 2019 14:16:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cSrtY0Yi" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727461AbfKOOQD (ORCPT ); Fri, 15 Nov 2019 09:16:03 -0500 Received: from mail-wr1-f66.google.com ([209.85.221.66]:42091 "EHLO mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727380AbfKOOQD (ORCPT ); Fri, 15 Nov 2019 09:16:03 -0500 Received: by mail-wr1-f66.google.com with SMTP id a15so11136245wrf.9 for ; Fri, 15 Nov 2019 06:16:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9jtPV+sCn/emRq7GWYLrHWbMxGrAYFh/6kTAYGDHCUY=; b=cSrtY0YiqCWLqhu/DeOgtNHn87diN7tfHlTt4MZTXWltxZj8Nm8S4dmMaLJdrHf4a4 O624yijItPwmRstuUin/phJuyX3mHRjcwf+/3uJiXG7cdbAiJpLcZThJsTjTZLBqo0jw P9qAP5ixZjq/ea/BF1gY/+29eBSyk2h0efbA00URV2D/UJOZcWiHIR3IPeKbhO/MbTPk XMtsHxbP6QvUaMuKdxe/BmNE6b/cMq4diLAGFZvFKRp/FD1FakemwQpJKANDCpb/hy17 fM/FZvctdb7kHlm2O2YwdOkdE5pCrCdlaoBWm6/gYRLNhYJu+8FlP/cCWd/iCe5LC7Gh zGVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9jtPV+sCn/emRq7GWYLrHWbMxGrAYFh/6kTAYGDHCUY=; b=Tcv8PxmuTZWceZGj06UQKFrsC1ZgVzPDWyfVNOCMKkExzbl9BxkpryDgbyjLCmBX9b oW9vmv9krY4YEGU3ZueBa+NEv0jJKAxn0LOVBeitpkVNQJp1X+OMCoWeu9MoP0u6c3MB NEO+DSNzgQKPGtIGtEwZmqkFVGVNJkW8ef/lkJkQ+ry2YJXqcHh8RUigPyA+RW3f7DTH WT3tBDGYL2VCVJbHx2Ni0Op4Veg2Q28iN0MAt2itiyL3ZNCGMsS1elAHVwMrqANU7rnm h9pkk8MiZSdavV6oZf3qONFtCm3q3daIga5Zjc7PKSC0L7Xgz1VCrCkBhIe3Hqchwn9r l5tw== X-Gm-Message-State: APjAAAU+8fUyHuIjafkgs3iAV/HXjZlrfiZr6o1GTahcyyWLHmCfV/va ehUuGkLFzkfQtQE/uCoJ8bR7dn9C X-Google-Smtp-Source: APXvYqzUVUUODiba7wozoG8o/Ww3tUxixv5tYjXamAhJro0XA8W4Cy5ec+LlQLtOvHm6+N/ZMbN9kA== X-Received: by 2002:a05:6000:18e:: with SMTP id p14mr15441667wrx.98.1573827360962; Fri, 15 Nov 2019 06:16:00 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.15.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:00 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 2/9] packfile: expose get_delta_base() Date: Fri, 15 Nov 2019 15:15:34 +0100 Message-Id: 
<20191115141541.11149-3-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King In a following commit get_delta_base() will be used outside packfile.c, so let's make it non static and declare it in packfile.h. Signed-off-by: Jeff King Signed-off-by: Christian Couder --- packfile.c | 10 +++++----- packfile.h | 3 +++ 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/packfile.c b/packfile.c index 355066de17..81e66847bf 100644 --- a/packfile.c +++ b/packfile.c @@ -1173,11 +1173,11 @@ const struct packed_git *has_packed_and_bad(struct repository *r, return NULL; } -static off_t get_delta_base(struct packed_git *p, - struct pack_window **w_curs, - off_t *curpos, - enum object_type type, - off_t delta_obj_offset) +off_t get_delta_base(struct packed_git *p, + struct pack_window **w_curs, + off_t *curpos, + enum object_type type, + off_t delta_obj_offset) { unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL); off_t base_offset; diff --git a/packfile.h b/packfile.h index fc7904ec81..ec536a4ae5 100644 --- a/packfile.h +++ b/packfile.h @@ -151,6 +151,9 @@ void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep); unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); +off_t get_delta_base(struct packed_git *p, struct pack_window **w_curs, + off_t *curpos, enum object_type type, + off_t delta_obj_offset); void release_pack_memory(size_t); From patchwork Fri Nov 15 14:15:35 2019 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246415 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C08BA1393 for ; Fri, 15 Nov 2019 14:16:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A113120730 for ; Fri, 15 Nov 2019 14:16:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MZE6FJ0f" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727514AbfKOOQG (ORCPT ); Fri, 15 Nov 2019 09:16:06 -0500 Received: from mail-wm1-f52.google.com ([209.85.128.52]:50608 "EHLO mail-wm1-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727380AbfKOOQG (ORCPT ); Fri, 15 Nov 2019 09:16:06 -0500 Received: by mail-wm1-f52.google.com with SMTP id l17so9805416wmh.0 for ; Fri, 15 Nov 2019 06:16:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SS7bRlq7DG8hvTPjGjIa+Gy8rxR6kTmDlHR4pTD7k/I=; b=MZE6FJ0fW7PHuD/EyZZwzYAvYwkqxbBR47LzwgMw7VG3qNoGfSJW3Ts8228Oop0Q9t U85bSz1QL2mCV3TSzkIPaEp3Zl9ygBaWm21IsSZs94UZUBa+NPLFP8WGzscBRR1xZ3vE nSXnGNupIs7z07cM6oFcELlu3hidApJbQBf+IR0fkhs5zlHMLOC8yRk5eFAeHIVL8T9s gkOJkBB0DgppYS7Gzj6sw7aTNVPWyBwNnsrf6/kbplwRA7I1N6tW3ehCq5p2GkXfXJhe SxhLmY4IbAR++rrkCX66/RtAg8fsWGitryFdfKO7J+AklMTi8zZOiRiJPQ6pI77yTK2m p6Aw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SS7bRlq7DG8hvTPjGjIa+Gy8rxR6kTmDlHR4pTD7k/I=; b=mZwzzjbrsmW+15SymP0lOaMOWKx2HzwL5KhHDTIgwcWvf+jp5zUdO6L9gqeRialCOU 
HMGd7ABHTjrdgqQiQZN9VNwlHMPrWMES2G7H+Ft8A3ty+mrLKfxZeKPASa2lcvwwNMVQ zsHjEP8+OCBgr8cttOgS4Hct6yJ6hN8z37CA9dyMjAX+jApGyV8Wm6g7hjf4Hw3+6CFv 3eP83/tWPtb4EIgsnRNtERz9IsghRbZdk0EUuZMbnHji8peZ2qVMU7zTWaVuebTRjPAK VsCUpaEsrWHPKNsdLHgE1zpDZrCyaL0aZVmYW2LAjds6L/VugsOvYVxnSTlamw6HxH+l FTdw== X-Gm-Message-State: APjAAAXDM1GdnlSvFrxi0eCAdSGNnM+GSLxf8vPoZtEf8fbC7qU0ig+m H0BkJ1dhv6y08kyrdUQwbwgzTtCJ X-Google-Smtp-Source: APXvYqwrYTo+s7wj14I9vGjVx57tfJf2Rxvos1Tr9BXv4MbBHTwcQdI+Y9QnW0lYVLJKqidN/wVYDA== X-Received: by 2002:a1c:6a0d:: with SMTP id f13mr15485161wmc.164.1573827362169; Fri, 15 Nov 2019 06:16:02 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:01 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 3/9] ewah/bitmap: introduce bitmap_word_alloc() Date: Fri, 15 Nov 2019 15:15:35 +0100 Message-Id: <20191115141541.11149-4-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King In a following commit we will need to allocate a variable number of bitmap words, instead of always 32, so let's add bitmap_word_alloc() for this purpose. 
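For illustration only, here is a minimal standalone sketch of the idea, not the code from this patch. It assumes plain malloc()/calloc()/realloc() in place of Git's xmalloc()/xcalloc()/REALLOC_ARRAY(), a 64-bit eword_t, and hypothetical sketch_* names. It shows why a caller-chosen word count (possibly zero) also requires the growth step in the set operation to handle block == 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t eword_t;                      /* assumption: 64-bit words */
#define BITS_IN_EWORD (sizeof(eword_t) * 8)

struct bitmap {
	eword_t *words;
	size_t word_alloc;
};

/* Allocate a bitmap with room for word_alloc zeroed words. */
static struct bitmap *sketch_bitmap_word_alloc(size_t word_alloc)
{
	struct bitmap *b = malloc(sizeof(*b));
	assert(b);
	/* calloc(0, ...) may return NULL, so always request at least one word */
	b->words = calloc(word_alloc ? word_alloc : 1, sizeof(eword_t));
	assert(b->words);
	b->word_alloc = word_alloc;
	return b;
}

/* Set bit pos, growing the word array when pos falls outside it. */
static void sketch_bitmap_set(struct bitmap *self, size_t pos)
{
	size_t block = pos / BITS_IN_EWORD;

	if (block >= self->word_alloc) {
		size_t old_size = self->word_alloc;
		/* block * 2 would stay 0 for block == 0, hence the special case */
		self->word_alloc = block ? block * 2 : 1;
		self->words = realloc(self->words,
				      self->word_alloc * sizeof(eword_t));
		assert(self->words);
		memset(self->words + old_size, 0,
		       (self->word_alloc - old_size) * sizeof(eword_t));
	}
	self->words[block] |= (eword_t)1 << (pos % BITS_IN_EWORD);
}

/* Return 1 if bit pos is set, 0 otherwise (including out-of-range pos). */
static int sketch_bitmap_get(const struct bitmap *self, size_t pos)
{
	size_t block = pos / BITS_IN_EWORD;

	return block < self->word_alloc &&
	       (self->words[block] & ((eword_t)1 << (pos % BITS_IN_EWORD))) != 0;
}
```

With an initial allocation of zero words, setting bit 3 grows the array to one word, and setting bit 64 afterwards grows it to two.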
Helped-by: Jonathan Tan Signed-off-by: Jeff King Signed-off-by: Christian Couder --- ewah/bitmap.c | 13 +++++++++---- ewah/ewok.h | 1 + 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/ewah/bitmap.c b/ewah/bitmap.c index 52f1178db4..b5fed9621f 100644 --- a/ewah/bitmap.c +++ b/ewah/bitmap.c @@ -22,21 +22,26 @@ #define EWAH_MASK(x) ((eword_t)1 << (x % BITS_IN_EWORD)) #define EWAH_BLOCK(x) (x / BITS_IN_EWORD) -struct bitmap *bitmap_new(void) +struct bitmap *bitmap_word_alloc(size_t word_alloc) { struct bitmap *bitmap = xmalloc(sizeof(struct bitmap)); - bitmap->words = xcalloc(32, sizeof(eword_t)); - bitmap->word_alloc = 32; + bitmap->words = xcalloc(word_alloc, sizeof(eword_t)); + bitmap->word_alloc = word_alloc; return bitmap; } +struct bitmap *bitmap_new(void) +{ + return bitmap_word_alloc(32); +} + void bitmap_set(struct bitmap *self, size_t pos) { size_t block = EWAH_BLOCK(pos); if (block >= self->word_alloc) { size_t old_size = self->word_alloc; - self->word_alloc = block * 2; + self->word_alloc = block ? 
block * 2 : 1; REALLOC_ARRAY(self->words, self->word_alloc); memset(self->words + old_size, 0x0, (self->word_alloc - old_size) * sizeof(eword_t)); diff --git a/ewah/ewok.h b/ewah/ewok.h index 84b2a29faa..1b98b57c8b 100644 --- a/ewah/ewok.h +++ b/ewah/ewok.h @@ -172,6 +172,7 @@ struct bitmap { }; struct bitmap *bitmap_new(void); +struct bitmap *bitmap_word_alloc(size_t word_alloc); void bitmap_set(struct bitmap *self, size_t pos); int bitmap_get(struct bitmap *self, size_t pos); void bitmap_reset(struct bitmap *self); From patchwork Fri Nov 15 14:15:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246421 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 80E7A13BD for ; Fri, 15 Nov 2019 14:16:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 61FDB20730 for ; Fri, 15 Nov 2019 14:16:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JLfFGDr3" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727507AbfKOOQI (ORCPT ); Fri, 15 Nov 2019 09:16:08 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:42099 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727492AbfKOOQH (ORCPT ); Fri, 15 Nov 2019 09:16:07 -0500 Received: by mail-wr1-f68.google.com with SMTP id a15so11136412wrf.9 for ; Fri, 15 Nov 2019 06:16:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5afrFYSZvARCzze+Y+Vs4B6LG5LOEcJhymnQvlI5uWU=; b=JLfFGDr3JF7hxzON5xiYBofwuDxn72XUg0c+spDMHDTkygcWrCdzMZte4FvRQmiEPt 
AEFdQX5hIdV4grv1Ki6ktw361tWZhmDUgsOROYM0fXJ4j9avXB8QKlVwVNrRfo1UJOc2 I3WLy3RYJONg3uxZxpJ4S5STyhvmBH54x+NmzKBY/1c4xApSRAiopK6YUvNrpC3gUalR qB4cFGKXaiEYjXUB9vlMKFw5SoGrMtAvX5FZGPUln/uQed6eQKCrKO3zzYQUHS9dIsSQ tdxPF9LY//eftk5GZkTrIF9DWWLlK+PDtqr0glgPrRtYVdLFSjCKX4bAlW1OLQGjTluk W7Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5afrFYSZvARCzze+Y+Vs4B6LG5LOEcJhymnQvlI5uWU=; b=uZiBYYC0H0l/sMZ+ci1WDFlU8qXp+wBAVBnIet0C9PvydXOg509B8dU4fYkteCQ3oq ybeqLGGtOXpf87Phk4T0wYCYO/8iNSz4nWFs+HuL2JkiN8iwfS+/pGE7H9KdfGR4D2Uh FFDOHPFttbow0U1gCYZfN557FNv85V0le3euqCGPLRB9+ZMV+7Bl68ZDoWkWVmMbcIop t5fiLj/OQDnXhBFzYgvPlCuHRfp4MaCoOMlagSMosR2VHsDptq/15Qfef0AcXsch2Psz 0zUOBkCF3eywp/mUnchLIzXwefnF4Z+O+ANpJYx5bJjBE38SmaEp30aHcgMyUc7jt4dN 6Smg== X-Gm-Message-State: APjAAAVt1KiuQZAQ5AdkcNV1deKPqyM9pqoOKq3QqJGnI+yZPFtKBnoG 2Iz+Xdj0f5yYoXVY0imCwrNFZfIK X-Google-Smtp-Source: APXvYqyUNObYBbVwhZCXvVFRcdzNt5qfNFCiTwlpW6XcFeLP2vUJV+s9WRHIVA2rum9PVzjEwynj/A== X-Received: by 2002:adf:dc81:: with SMTP id r1mr16806176wrj.84.1573827363555; Fri, 15 Nov 2019 06:16:03 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:02 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 4/9] pack-bitmap: don't rely on bitmap_git->reuse_objects Date: Fri, 15 Nov 2019 15:15:36 +0100 Message-Id: <20191115141541.11149-5-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: 
git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King We will no longer compute bitmap_git->reuse_objects in a following commit, so we cannot rely on it anymore to terminate the loop early; we have to iterate to the end. Signed-off-by: Jeff King Signed-off-by: Christian Couder --- pack-bitmap.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index e07c798879..016d0319fc 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -622,7 +622,7 @@ static void show_objects_for_type( enum object_type object_type, show_reachable_fn show_reach) { - size_t pos = 0, i = 0; + size_t i = 0; uint32_t offset; struct ewah_iterator it; @@ -630,13 +630,15 @@ static void show_objects_for_type( struct bitmap *objects = bitmap_git->result; - if (bitmap_git->reuse_objects == bitmap_git->pack->num_objects) - return; - ewah_iterator_init(&it, type_filter); - while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) { + for (i = 0; i < objects->word_alloc && + ewah_iterator_next(&filter, &it); i++) { eword_t word = objects->words[i] & filter; + size_t pos = (i * BITS_IN_EWORD); + + if (!word) + continue; for (offset = 0; offset < BITS_IN_EWORD; ++offset) { struct object_id oid; @@ -648,9 +650,6 @@ static void show_objects_for_type( offset += ewah_bit_ctz64(word >> offset); - if (pos + offset < bitmap_git->reuse_objects) - continue; - entry = &bitmap_git->pack->revindex[pos + offset]; nth_packed_object_oid(&oid, bitmap_git->pack, entry->nr); @@ -659,9 +658,6 @@ static void show_objects_for_type( show_reach(&oid, object_type, 0, hash, bitmap_git->pack, entry->offset); } - - pos += BITS_IN_EWORD; - i++; } } From patchwork Fri Nov 15 14:15:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246417 Return-Path: Received: from mail.kernel.org 
(pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BCA031393 for ; Fri, 15 Nov 2019 14:16:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9DC7020733 for ; Fri, 15 Nov 2019 14:16:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ouGxWSvV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727528AbfKOOQH (ORCPT ); Fri, 15 Nov 2019 09:16:07 -0500 Received: from mail-wm1-f50.google.com ([209.85.128.50]:38548 "EHLO mail-wm1-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727420AbfKOOQG (ORCPT ); Fri, 15 Nov 2019 09:16:06 -0500 Received: by mail-wm1-f50.google.com with SMTP id z19so10594816wmk.3 for ; Fri, 15 Nov 2019 06:16:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Or9CZRa468VpWlF4CBSYitgmMNVyelgxDsvskSvosXQ=; b=ouGxWSvVBnI74siKjSzS5VCxO0EgZ4lTjxGQ9TFgml8j/ZfwqpDmlqWxx0EHeBC+O2 TaVrSS1aPQRHjq6amacJ9BWTxZy1HSGRjN2/DndUgWb/Hf7X5PdEoQmRrJ45yGc470Qy vf1OM1W8Ttns6QU3KM35l/pu3TswnNMoZYxaOPLsaNGae6P92oh3bKyT1CP32ecnViMq Hz4NWuDv4PwofjWicluxTIuJZqdy5xCLqV1ysuuMdrc+Xt2r9IwvlWOYio+fmEsz9Ohr rUMHd7qmYM8ka/U+002hRveIaTf0BEKTp4comJZKnqLsi+ASo6DJTnWIHGUWpuqK4sRA 07bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Or9CZRa468VpWlF4CBSYitgmMNVyelgxDsvskSvosXQ=; b=FD+cKo9Kr9yk0hMDVjYEuDY4FddlkNwL0MhAhZykTZn/vDnf7jIMlIYZ8w3rjQo4jh 9ZpzlluZwX6YWiPpHqItEJr+7TBTHPWOQq/j+obw5Zwdyu/0+b8vF75p9x8CGdFJ4rDa xPDPmDdGUD/H7NG0+ey4QB1qZdH40Bpi9wKLG9Ynqe/ZaplULFoBZnsc5UoVXyTaInnS 
6hDehJV56kGcs+v5Nz5zISiBw7CT7rqACrig0YT4s5BjnnvrCEEN61jIW4DZWh2vtk6O rs2T+1m34HRgRPMOEEGCzWyzU6VSAEOpBEwhDwjo3kApOWIHE0U0mFXGu6r3+cjLOudR 5sHw== X-Gm-Message-State: APjAAAV++bBHHK2NqDmI8231y/Fcu5dMF+7zjPX9AmvAPKft5CsKTqsV gSykr8cVh3dM7KGHUPAVIQgd4tnR X-Google-Smtp-Source: APXvYqzw3HD5sD/zkioKt/v49+Kf9zaJBt2tvFYF8VEuxNccngTHqsApbUtk476bPGpnxzwYDZ5rRw== X-Received: by 2002:a1c:39c1:: with SMTP id g184mr14825536wma.75.1573827364642; Fri, 15 Nov 2019 06:16:04 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:04 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 5/9] pack-bitmap: introduce bitmap_walk_contains() Date: Fri, 15 Nov 2019 15:15:37 +0100 Message-Id: <20191115141541.11149-6-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King We will use this helper function in a following commit to tell us if an object is packed. 
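As a rough standalone sketch of what such a helper does (illustrative only, not the code from this patch): look up the bit position of an object ID, then test that bit in the given bitmap. All sketch_* types and functions are hypothetical stand-ins for Git's struct bitmap_index, struct bitmap, bitmap_position() and bitmap_get(); the linear lookup is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the types used by the patch. */
struct sketch_oid {
	unsigned char hash[20];
};

struct sketch_index {
	const struct sketch_oid *oids;  /* oids[i] is tracked at bit position i */
	size_t nr;
};

struct sketch_bitmap {
	const uint64_t *words;
	size_t word_alloc;
};

/* Return the bit position of oid, or -1 if the index does not know it. */
static int sketch_position(const struct sketch_index *idx,
			   const struct sketch_oid *oid)
{
	for (size_t i = 0; i < idx->nr; i++)
		if (!memcmp(idx->oids[i].hash, oid->hash, sizeof(oid->hash)))
			return (int)i;
	return -1;
}

/* Return 1 if bit pos is set, 0 otherwise (including out-of-range pos). */
static int sketch_get(const struct sketch_bitmap *b, size_t pos)
{
	size_t block = pos / 64;

	return block < b->word_alloc && ((b->words[block] >> (pos % 64)) & 1);
}

/* Return 1 only if oid is known to the index and its bit is set. */
static int sketch_walk_contains(const struct sketch_index *idx,
				const struct sketch_bitmap *bitmap,
				const struct sketch_oid *oid)
{
	int pos;

	if (!bitmap)
		return 0;

	pos = sketch_position(idx, oid);
	return pos >= 0 && sketch_get(bitmap, (size_t)pos);
}
```

The NULL check and the negative-position check mean that a missing bitmap or an unknown object both report "not contained" rather than causing an error.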
Signed-off-by: Jeff King Signed-off-by: Christian Couder --- pack-bitmap.c | 12 ++++++++++++ pack-bitmap.h | 3 +++ 2 files changed, 15 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index 016d0319fc..8a51302a1a 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -826,6 +826,18 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, return 0; } +int bitmap_walk_contains(struct bitmap_index *bitmap_git, + struct bitmap *bitmap, const struct object_id *oid) +{ + int idx; + + if (!bitmap) + return 0; + + idx = bitmap_position(bitmap_git, oid); + return idx >= 0 && bitmap_get(bitmap, idx); +} + void traverse_bitmap_commit_list(struct bitmap_index *bitmap_git, show_reachable_fn show_reachable) { diff --git a/pack-bitmap.h b/pack-bitmap.h index 466c5afa09..6ab6033dbe 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -3,6 +3,7 @@ #include "ewah/ewok.h" #include "khash.h" +#include "pack.h" #include "pack-objects.h" struct commit; @@ -53,6 +54,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *, int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping, kh_oid_map_t *reused_bitmaps, int show_progress); void free_bitmap_index(struct bitmap_index *); +int bitmap_walk_contains(struct bitmap_index *, + struct bitmap *bitmap, const struct object_id *oid); /* * After a traversal has been performed by prepare_bitmap_walk(), this can be From patchwork Fri Nov 15 14:15:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246419 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3E474930 for ; Fri, 15 Nov 2019 14:16:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1EF0B20730 for ; Fri, 15 Nov 2019 14:16:10 +0000 (UTC) 
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="fJH07Nsj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727539AbfKOOQJ (ORCPT ); Fri, 15 Nov 2019 09:16:09 -0500 Received: from mail-wr1-f65.google.com ([209.85.221.65]:35969 "EHLO mail-wr1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727380AbfKOOQI (ORCPT ); Fri, 15 Nov 2019 09:16:08 -0500 Received: by mail-wr1-f65.google.com with SMTP id r10so11168931wrx.3 for ; Fri, 15 Nov 2019 06:16:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mRcPan6gwhuTjJc3zOP220A6CuVm8SCWShmN02+zsxc=; b=fJH07NsjnAFc0wvYT0Qumq0cA8shdOItBz0yk0fBotX2Y8Se/8v2Xt8eM9j5ob4Rr/ 8yz6FqUhpn9d6SNAMSZXm4sy6w5wETkn9XsabuqwJI3VztswooVKAhPIo5thzd/PzPBo /09HkQbotxlfPMxk3YfcT3xQgZjwHYlWyI3EmCYk8yKUciDk98LNAhZCsPjMC/YDGa3g WNMgD2QU/yDjp+uznQ5feD0Bz+D9BNXoumTonkmwNADrnboCOa3V8OUy5rByWeb3WR8Y siTALE9ND/JSWc+Og7WNOGHIZtnZntDdEIhczb9QW4nDr5iaO5usjbzAnn/RVSp7c1iK FAPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mRcPan6gwhuTjJc3zOP220A6CuVm8SCWShmN02+zsxc=; b=nPeTcz+GQw4ubG5T6LYUlwnRUz8YBeZwesx9+7LRqhCchH8E12SCt+qCEgOpEwjfDe Xe8GdaOn5RBjMdRY0mkQEsgbbqDVWnER0ecnSNigtphSHrjHiQwoCBzQ0uX8k7RuEa3A SfqiqjYVW/82Dqvv7cQDXN/DpqikJqSOiosW0dqMaEaOSx3eZJtP+3Dyg9vLyyFnvxD/ nQqfTm/vC5WxUWWhSigA0Gq50nVESjCmUbUyYaxe/A18QU/zx8zRKDRw3PCb+bYRQSy8 bSCleFNnLGC+x8WDN4pUxgbCiE2PHcYRMYNK4HxZYs3V3T5gOo2xWmNHS/MbJ9bMf1/+ XorQ== X-Gm-Message-State: APjAAAXqHtBOQr8rv8PsLRImtn7UecDZsc9a2j0sz9U3qi8n1eAegHsX Ff9m123yh0wDYFRwxdHbzI9TI3y/ X-Google-Smtp-Source: APXvYqwm5RhjrI324GVO6jOEY48yz0kHE151ROuK51APEMM9/mWGODoEBENKP0F17ytnQ7/H7jNMZQ== X-Received: by 
2002:adf:fd45:: with SMTP id h5mr16942336wrs.388.1573827365853; Fri, 15 Nov 2019 06:16:05 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:05 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 6/9] csum-file: introduce hashfile_total() Date: Fri, 15 Nov 2019 15:15:38 +0100 Message-Id: <20191115141541.11149-7-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King We will need this helper function in a following commit to give us total number of bytes fed to the hashfile so far. Signed-off-by: Jeff King Signed-off-by: Christian Couder --- csum-file.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/csum-file.h b/csum-file.h index a98b1eee53..f9cbd317fb 100644 --- a/csum-file.h +++ b/csum-file.h @@ -42,6 +42,15 @@ void hashflush(struct hashfile *f); void crc32_begin(struct hashfile *); uint32_t crc32_end(struct hashfile *); +/* + * Returns the total number of bytes fed to the hashfile so far (including ones + * that have not been written out to the descriptor yet). 
+ */ +static inline off_t hashfile_total(struct hashfile *f) +{ + return f->total + f->offset; +} + static inline void hashwrite_u8(struct hashfile *f, uint8_t data) { hashwrite(f, &data, sizeof(data)); From patchwork Fri Nov 15 14:15:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246423 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8AA7D930 for ; Fri, 15 Nov 2019 14:16:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6AD1F20730 for ; Fri, 15 Nov 2019 14:16:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ViVz9y/2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727555AbfKOOQK (ORCPT ); Fri, 15 Nov 2019 09:16:10 -0500 Received: from mail-wr1-f65.google.com ([209.85.221.65]:38707 "EHLO mail-wr1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727420AbfKOOQJ (ORCPT ); Fri, 15 Nov 2019 09:16:09 -0500 Received: by mail-wr1-f65.google.com with SMTP id i12so11150341wro.5 for ; Fri, 15 Nov 2019 06:16:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KIWLkMKJFKnIIAPn8O+rtAi8qCbUQlu156Hxt6sNP0Q=; b=ViVz9y/28k/cLvnC5Pq8JulyKqVHM1Arbz2psuRyEY8k4+qfkPZlJf0cVVhYgmXDza GjZFDMi7mFoB0SlxfgtxOpGSVKrtf0HYrAzI0ywwbPbU4sXht0GvQzXIhOWz+433O7zY lEgtHKQqPoJ15irONmPy735nv03AWmk5GQxDHeO2QWYaW6uZN49LwObhcAdiA+qD/9OV pk636NI2aU5bvxAk8EzrEcF6rl3aHPjaOUjsX2XWiRECozFDuHDN/yPyKFd8jl3gQ4p9 BMJUImWYj13hsL5NlW1w9p86Arq90vaAfH/6QMVvqs4xzgMPkM2AXQ8UWVgd0jh7ggsN ZdLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KIWLkMKJFKnIIAPn8O+rtAi8qCbUQlu156Hxt6sNP0Q=; b=lP/9qRXzKZd2rGh7zgkFuH/qJnYqNHdxVQQr4qWp8dUAaAU74hPZWoNXc8PcFoOE2o iwSYebvJX4OjvDuvO1kSuvdmqoOAKunW0GGOgrB9bmNH2OxxedcfPYI9BYqwHtQDEP6n bfMAsGX6/s1rd3MJGeErMDyMSRarzXQhM66+wr/AqUGeu23C6WhQ5mApE5rPOWpuA2qh 4l1AFtsTysuKrh60pn+5KcRE9wvr+/7+l/LG4muUomhAJcHp5CFmq760Q3pHr6va0BCd xALAcQOBYhcLXAxqstNb5jKoskrw6vA50GJXWOlqM+JEam2tHuhP2i8nGOCWjVDh0It2 IeJA== X-Gm-Message-State: APjAAAVEdAPMCtg7DAD5A/Z1jJWmd1OUkhKbBBOu4n0R4mcvh/BGuLPr t4YfnKr9scvL4nvY35dFGabdjGxG X-Google-Smtp-Source: APXvYqzky/GCusMe8haXapGlmcGjcIC1uvM7xHlqO+HVskBie7zIKL5Q33VyLejISXoxc2A43jR5WQ== X-Received: by 2002:a05:6000:12c7:: with SMTP id l7mr15766509wrx.128.1573827366901; Fri, 15 Nov 2019 06:16:06 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:06 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 7/9] pack-objects: introduce pack.allowPackReuse Date: Fri, 15 Nov 2019 15:15:39 +0100 Message-Id: <20191115141541.11149-8-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King Let's make it possible to configure if we want pack reuse or not. 
Signed-off-by: Jeff King Signed-off-by: Christian Couder --- Documentation/config/pack.txt | 4 ++++ builtin/pack-objects.c | 8 +++++++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index 1d66f0c992..58323a351f 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -27,6 +27,10 @@ Note that changing the compression level will not automatically recompress all existing objects. You can force recompression by passing the -F option to linkgit:git-repack[1]. +pack.allowPackReuse:: + When true, which is the default, Git will try to reuse parts + of existing packfiles when preparing new packfiles. + pack.island:: An extended regular expression configuring a set of delta islands. See "DELTA ISLANDS" in linkgit:git-pack-objects[1] diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f2c2703090..4fcfcf6097 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -96,6 +96,7 @@ static off_t reuse_packfile_offset; static int use_bitmap_index_default = 1; static int use_bitmap_index = -1; +static int allow_pack_reuse = 1; static enum { WRITE_BITMAP_FALSE = 0, WRITE_BITMAP_QUIET, @@ -2699,6 +2700,10 @@ static int git_pack_config(const char *k, const char *v, void *cb) use_bitmap_index_default = git_config_bool(k, v); return 0; } + if (!strcmp(k, "pack.allowpackreuse")) { + allow_pack_reuse = git_config_bool(k, v); + return 0; + } if (!strcmp(k, "pack.threads")) { delta_search_threads = git_config_int(k, v); if (delta_search_threads < 0) @@ -3030,7 +3035,8 @@ static void loosen_unused_packed_objects(void) */ static int pack_options_allow_reuse(void) { - return pack_to_stdout && + return allow_pack_reuse && + pack_to_stdout && allow_ofs_delta && !ignore_packed_keep_on_disk && !ignore_packed_keep_in_core && From patchwork Fri Nov 15 14:15:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246425 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 31ED3930 for ; Fri, 15 Nov 2019 14:16:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 11EA820730 for ; Fri, 15 Nov 2019 14:16:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dyxalhFB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727466AbfKOOQN (ORCPT ); Fri, 15 Nov 2019 09:16:13 -0500 Received: from mail-wr1-f67.google.com ([209.85.221.67]:42110 "EHLO mail-wr1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727380AbfKOOQK (ORCPT ); Fri, 15 Nov 2019 09:16:10 -0500 Received: by mail-wr1-f67.google.com with SMTP id a15so11136732wrf.9 for ; Fri, 15 Nov 2019 06:16:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Fgbi3/EefnACkW8uEBLXho5/IwKreRzAADE3NdXRozg=; b=dyxalhFBiofs03Ha1dP5Lvf9xCVrzTLwpwqJu055U/9v+QxDhOFTEH+DWpP/Yt4ZFM M+IDK1+wAoxeh7XcwQW26b/IiNOFkvVCVJUdVLCG33YqKJ5LO8w22IArYZrAQjv4c6BD Zk7z+KoZChPfQnV/EyvErqoHIRSpwYYIU404bSjtbKJR6HEPnyKrt/MJIuerUQ/qpO3x lYK5GMwXqgn0Xfy1Nt9ekwRVIgPANB/ijJS13QWG0L5eWwJbJztgo4bp4a6cjdF2SpLZ FoXv0/jGUIrq4nGoyW/XFu5WZQC/73a9sd4OhVrLXwE+W7g58YJ0yaLBAQFkKIOvQpFj +g8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Fgbi3/EefnACkW8uEBLXho5/IwKreRzAADE3NdXRozg=; b=jAhOvWM0xBv1DSf1bMLKIzS9yxAQDQtZB8cYLSJidLkc0g55xphFj6rurvQiAmDZeS 
BPeRs5tCkuGpswd1YS0lc+hHw+x3YDei9IBFA4fWD4VPy8CIuu3V0lDw/thTckW2RUgf fJwN2RHpapz1ZCFcBmhgvli3r88EQ0dIN1zxpXHGY7k4L1bHdcUM61q9I0iP6BABLcbE umYE7nniV94i4xcUUWGY/Xa+hWF6Lt2hWTz6MOZr2phM+6R1CVd4ts6CEbRg0mtr6z8Z pu191YGgofIwIFfZ99amWlZk0w4SaL53v+EKPWXS1CcVdP4UoBq5hA2vfIN9gJkX4His mepg== X-Gm-Message-State: APjAAAVmiKvu8KjY6vEWEchTGWVfi/pHZnqDgPjyjQcU28ghiFqZHLX6 Kc8paMRPITs8w45FZGDstv4eOoQ6 X-Google-Smtp-Source: APXvYqyJ3Gitf2sb5ZoZmRowFTjcN55l1l9fUOBReLmykQfW28e5dD82NiURNaSbRrp6C5v7TrW+fA== X-Received: by 2002:adf:9e05:: with SMTP id u5mr16145935wre.239.1573827368420; Fri, 15 Nov 2019 06:16:08 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:07 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 8/9] builtin/pack-objects: introduce obj_is_packed() Date: Fri, 15 Nov 2019 15:15:40 +0100 Message-Id: <20191115141541.11149-9-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King Let's refactor the way we check if an object is packed by introducing obj_is_packed(). This function is now a simple wrapper around packlist_find(), but it will evolve in a following commit. 
Signed-off-by: Jeff King Signed-off-by: Christian Couder --- builtin/pack-objects.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 4fcfcf6097..08898331ef 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2553,6 +2553,11 @@ static void ll_find_deltas(struct object_entry **list, unsigned list_size, free(p); } +static int obj_is_packed(const struct object_id *oid) +{ + return !!packlist_find(&to_pack, oid); +} + static void add_tag_chain(const struct object_id *oid) { struct tag *tag; @@ -2564,7 +2569,7 @@ static void add_tag_chain(const struct object_id *oid) * it was included via bitmaps, we would not have parsed it * previously). */ - if (packlist_find(&to_pack, oid)) + if (obj_is_packed(oid)) return; tag = lookup_tag(the_repository, oid); @@ -2588,7 +2593,7 @@ static int add_ref_tag(const char *path, const struct object_id *oid, int flag, if (starts_with(path, "refs/tags/") && /* is a tag? */ !peel_ref(path, &peeled) && /* peelable? */ - packlist_find(&to_pack, &peeled)) /* object packed? */ + obj_is_packed(&peeled)) /* object packed? 
*/ add_tag_chain(oid); return 0; } From patchwork Fri Nov 15 14:15:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11246427 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8D946930 for ; Fri, 15 Nov 2019 14:16:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5213920730 for ; Fri, 15 Nov 2019 14:16:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kyL3iFme" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727579AbfKOOQO (ORCPT ); Fri, 15 Nov 2019 09:16:14 -0500 Received: from mail-wm1-f67.google.com ([209.85.128.67]:51673 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727420AbfKOOQN (ORCPT ); Fri, 15 Nov 2019 09:16:13 -0500 Received: by mail-wm1-f67.google.com with SMTP id q70so9811470wme.1 for ; Fri, 15 Nov 2019 06:16:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=iUUvoQ1aEift+eLVxTiwXNOaFhlHuOcL4fLpMZ7le/I=; b=kyL3iFmeT1b3H1jUzKcSntI9i/kkIfDrCCn2ylGncXUibC+pKP2ZT52WaRPfGn5lNz 14gwasENrgKANWpuRxku1eK6z1aBlRTnXbyx/iR9/vWbIwNQimRty9q/+9xQJaWBErrB qYg3tQT1E1OXmcEHDoF4wZJZAcp49CqVvC3kL7aH349yFOhuDF7/jlfVsYfvY1A+O10B oi/PVa6XtJV9F85PcIN0hGwoElKdX+vw/LWv4j/vw5UiPfYLLqKu1aIEkvVqEj1ENZ5t LvimHdY7Ow8o2AvOeOn/pCh7aMHJn7/b65cdYy385w7OLe86hpJGy4RwS2hRjxmifLyp 8+ag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=iUUvoQ1aEift+eLVxTiwXNOaFhlHuOcL4fLpMZ7le/I=; b=ShHWf0ebEjfHqPv5oCh8JmOXLI0MmFF4MTgCCYIJ8ViLWb1IMyfo/GMgdOGTpalx8p 4QKSqOUt0YQ4g5b6ggzcCe4MUcXCni5bdrWP6+yut69nEnXtfqD+4JSyeiq6VQPtLYoP 67dwsVmifgFdKJSdnEnPdJ3WlXDiTifYRzr+m1VXqOuqUN66feOe5v1PmeSYkEAHnkm6 DVVGXpYEqkPh3X0sagI8q1qtlzzhBlvMqJ1hbhWQEelSHGxSOW2jBUPfOrxQnSivcZv+ UhHn7mSKsyUSChkhHtt5Ntfe0LS3zj5Gms2HkP3doFxV8sIOa+tcbFRwY1HFiVR6Lni8 SyMw== X-Gm-Message-State: APjAAAXAiQcxbtXCjlgTSfyazxV1RiE8AX/xOR/TitzWdrnctXqYXdfr jqp9eHl6v8fnDeA33/nN04/niM4q X-Google-Smtp-Source: APXvYqzjwFrUqqUr4xEqQveYCnguwO6wwN42p16mKuBCnNQv50sA86C3Lstbg4S4MIkA+J5cWkytsg== X-Received: by 2002:a1c:5fc4:: with SMTP id t187mr15638174wmb.142.1573827369596; Fri, 15 Nov 2019 06:16:09 -0800 (PST) Received: from localhost.localdomain ([2a04:cec0:1050:ac52:b4cd:f6a2:ba59:f1d4]) by smtp.gmail.com with ESMTPSA id a2sm7907874wrt.79.2019.11.15.06.16.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2019 06:16:09 -0800 (PST) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v3 9/9] pack-objects: improve partial packfile reuse Date: Fri, 15 Nov 2019 15:15:41 +0100 Message-Id: <20191115141541.11149-10-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0-rc1 In-Reply-To: <20191115141541.11149-1-chriscool@tuxfamily.org> References: <20191115141541.11149-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King The old code to reuse deltas from an existing packfile just tried to dump a whole segment of the pack verbatim. That's faster than the traditional way of actually adding objects to the packing list, but it didn't kick in very often. This new code is really going for a middle ground: do _some_ per-object work, but way less than we'd traditionally do. 
For instance, packing torvalds/linux on GitHub servers just now reused 6.5M objects, but only needed ~50k chunks.

To implement this, we store the chunks of the packfile that we reuse in a dynamic array of `struct reused_chunk`, and we use a reuse_packfile_bitmap to speed up reusing parts of packfiles.

The dynamic array of `struct reused_chunk` is useful because we need to know the accumulated offset due to missing objects. So without the array we'd end up having to walk over the revindex for that set of objects. The array is basically caching those accumulated offsets (for the parts we _do_ include), so we don't have to compute them repeatedly.

Additional checks are added in have_duplicate_entry() and obj_is_packed() to avoid duplicate objects in the reuse bitmap. It was probably buggy to not have such a check before.

If a client both asks for a tag by sha1 and specifies "include-tag", we may end up including the tag in the reuse bitmap (due to the first thing), and then later adding it to the packlist (due to the second). This results in duplicate objects in the pack, which git chokes on. We should notice that we are already including it when doing the include-tag portion, and avoid adding it to the packlist.

The simplest place to fix this is right in add_ref_tag, where we could avoid peeling the tag at all if we know that we are already including it. However, I've pushed the check instead into have_duplicate_entry(). This fixes not only this case, but also means that we cannot have any similar problems lurking in other code.

No tests, because git does not actually exhibit this "ask for it and also include-tag" behavior. We do one or the other on clone, depending on whether --single-branch is set. However, libgit2 does both.
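
As an illustration only (not part of the patch), the chunk bookkeeping described in this message can be sketched in standalone C. The names `struct chunk`, `record_chunk()`, `find_difference()` and `rewritten_delta_distance()` are hypothetical stand-ins for `struct reused_chunk`, `record_reused_object()`, `find_reused_offset()` and the fixup arithmetic in `write_reused_pack_one()`. Plain realloc() replaces ALLOC_GROW here, and the sketch assumes chunks are recorded in increasing order of source offset.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h> /* off_t */

/* Hypothetical stand-in for the patch's struct reused_chunk. */
struct chunk {
	off_t original;   /* offset of the chunk's first object in the source pack */
	off_t difference; /* source offset minus output offset, shared by the chunk */
};

static struct chunk *chunks;
static size_t chunks_nr, chunks_alloc;

/* Start a new chunk only when the shift differs from the previous chunk's. */
static void record_chunk(off_t where, off_t difference)
{
	if (chunks_nr && chunks[chunks_nr - 1].difference == difference)
		return; /* same shift: this object extends the current chunk */

	if (chunks_nr == chunks_alloc) {
		chunks_alloc = chunks_alloc ? 2 * chunks_alloc : 16;
		chunks = realloc(chunks, chunks_alloc * sizeof(*chunks));
		if (!chunks)
			abort();
	}
	chunks[chunks_nr].original = where;
	chunks[chunks_nr].difference = difference;
	chunks_nr++;
}

/* Binary search for the chunk containing "where"; return its shift. */
static off_t find_difference(off_t where)
{
	size_t lo = 0, hi = chunks_nr;

	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;
		if (where == chunks[mi].original)
			return chunks[mi].difference;
		if (where < chunks[mi].original)
			hi = mi;
		else
			lo = mi + 1;
	}
	assert(lo > 0); /* "where" must not precede the first chunk */
	return chunks[lo - 1].difference;
}

/* Distance to store in a rewritten OFS_DELTA header, in output-pack terms. */
static off_t rewritten_delta_distance(off_t offset, off_t base_offset)
{
	off_t fixup = find_difference(offset) - find_difference(base_offset);
	return offset - base_offset - fixup;
}
```

With made-up offsets: suppose object A sits at offset 12 in the source pack, B at 112 is omitted, and C at 162 is an OFS_DELTA against A. After record_chunk(12, 0) and record_chunk(162, 50), rewritten_delta_distance(162, 12) yields 100, which is the distance between C's new position (112) and A's (12) in the generated pack.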
Helped-by: Jonathan Tan Signed-off-by: Jeff King Signed-off-by: Christian Couder --- builtin/pack-objects.c | 222 ++++++++++++++++++++++++++++++++--------- pack-bitmap.c | 150 ++++++++++++++++++++-------- pack-bitmap.h | 3 +- 3 files changed, 288 insertions(+), 87 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 08898331ef..64ab033923 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -92,7 +92,7 @@ static struct progress *progress_state; static struct packed_git *reuse_packfile; static uint32_t reuse_packfile_objects; -static off_t reuse_packfile_offset; +static struct bitmap *reuse_packfile_bitmap; static int use_bitmap_index_default = 1; static int use_bitmap_index = -1; @@ -785,57 +785,185 @@ static struct object_entry **compute_write_order(void) return wo; } -static off_t write_reused_pack(struct hashfile *f) + +/* + * A reused set of objects. All objects in a chunk have the same + * relative position in the original packfile and the generated + * packfile. + */ + +static struct reused_chunk { + /* The offset of the first object of this chunk in the original + * packfile. */ + off_t original; + /* The offset of the first object of this chunk in the generated + * packfile minus "original". 
*/ + off_t difference; +} *reused_chunks; +static int reused_chunks_nr; +static int reused_chunks_alloc; + +static void record_reused_object(off_t where, off_t offset) { - unsigned char buffer[8192]; - off_t to_write, total; - int fd; + if (reused_chunks_nr && reused_chunks[reused_chunks_nr-1].difference == offset) + return; - if (!is_pack_valid(reuse_packfile)) - die(_("packfile is invalid: %s"), reuse_packfile->pack_name); + ALLOC_GROW(reused_chunks, reused_chunks_nr + 1, + reused_chunks_alloc); + reused_chunks[reused_chunks_nr].original = where; + reused_chunks[reused_chunks_nr].difference = offset; + reused_chunks_nr++; +} - fd = git_open(reuse_packfile->pack_name); - if (fd < 0) - die_errno(_("unable to open packfile for reuse: %s"), - reuse_packfile->pack_name); +/* + * Binary search to find the chunk that "where" is in. Note + * that we're not looking for an exact match, just the first + * chunk that contains it (which implicitly ends at the start + * of the next chunk). + */ +static off_t find_reused_offset(off_t where) +{ + int lo = 0, hi = reused_chunks_nr; + while (lo < hi) { + int mi = lo + ((hi - lo) / 2); + if (where == reused_chunks[mi].original) + return reused_chunks[mi].difference; + if (where < reused_chunks[mi].original) + hi = mi; + else + lo = mi + 1; + } - if (lseek(fd, sizeof(struct pack_header), SEEK_SET) == -1) - die_errno(_("unable to seek in reused packfile")); + /* + * The first chunk starts at zero, so we can't have gone below + * there. 
+ */ + assert(lo); + return reused_chunks[lo-1].difference; +} + +static void write_reused_pack_one(size_t pos, struct hashfile *out, + struct pack_window **w_curs) +{ + off_t offset, next, cur; + enum object_type type; + unsigned long size; - if (reuse_packfile_offset < 0) - reuse_packfile_offset = reuse_packfile->pack_size - the_hash_algo->rawsz; + offset = reuse_packfile->revindex[pos].offset; + next = reuse_packfile->revindex[pos + 1].offset; - total = to_write = reuse_packfile_offset - sizeof(struct pack_header); + record_reused_object(offset, offset - hashfile_total(out)); - while (to_write) { - int read_pack = xread(fd, buffer, sizeof(buffer)); + cur = offset; + type = unpack_object_header(reuse_packfile, w_curs, &cur, &size); + assert(type >= 0); - if (read_pack <= 0) - die_errno(_("unable to read from reused packfile")); + if (type == OBJ_OFS_DELTA) { + off_t base_offset; + off_t fixup; + + unsigned char header[MAX_PACK_OBJECT_HEADER]; + unsigned len; + + base_offset = get_delta_base(reuse_packfile, w_curs, &cur, type, offset); + assert(base_offset != 0); + + /* Convert to REF_DELTA if we must... */ + if (!allow_ofs_delta) { + int base_pos = find_revindex_position(reuse_packfile, base_offset); + const unsigned char *base_sha1 = + nth_packed_object_sha1(reuse_packfile, + reuse_packfile->revindex[base_pos].nr); + + len = encode_in_pack_object_header(header, sizeof(header), + OBJ_REF_DELTA, size); + hashwrite(out, header, len); + hashwrite(out, base_sha1, 20); + copy_pack_data(out, reuse_packfile, w_curs, cur, next - cur); + return; + } - if (read_pack > to_write) - read_pack = to_write; + /* Otherwise see if we need to rewrite the offset... 
*/ + fixup = find_reused_offset(offset) - + find_reused_offset(base_offset); + if (fixup) { + unsigned char ofs_header[10]; + unsigned i, ofs_len; + off_t ofs = offset - base_offset - fixup; - hashwrite(f, buffer, read_pack); - to_write -= read_pack; + len = encode_in_pack_object_header(header, sizeof(header), + OBJ_OFS_DELTA, size); + + i = sizeof(ofs_header) - 1; + ofs_header[i] = ofs & 127; + while (ofs >>= 7) + ofs_header[--i] = 128 | (--ofs & 127); + + ofs_len = sizeof(ofs_header) - i; + + hashwrite(out, header, len); + hashwrite(out, ofs_header + sizeof(ofs_header) - ofs_len, ofs_len); + copy_pack_data(out, reuse_packfile, w_curs, cur, next - cur); + return; + } + + /* ...otherwise we have no fixup, and can write it verbatim */ + } + + copy_pack_data(out, reuse_packfile, w_curs, offset, next - offset); +} + +static size_t write_reused_pack_verbatim(struct hashfile *out, + struct pack_window **w_curs) +{ + size_t pos = 0; + + while (pos < reuse_packfile_bitmap->word_alloc && + reuse_packfile_bitmap->words[pos] == (eword_t)~0) + pos++; + + if (pos) { + off_t to_write; + + written = (pos * BITS_IN_EWORD); + to_write = reuse_packfile->revindex[written].offset + - sizeof(struct pack_header); + + /* We're recording one chunk, not one object. */ + record_reused_object(sizeof(struct pack_header), 0); + hashflush(out); + copy_pack_data(out, reuse_packfile, w_curs, + sizeof(struct pack_header), to_write); - /* - * We don't know the actual number of objects written, - * only how many bytes written, how many bytes total, and - * how many objects total. So we can fake it by pretending all - * objects we are writing are the same size. This gives us a - * smooth progress meter, and at the end it matches the true - * answer. 
- */ - written = reuse_packfile_objects * - (((double)(total - to_write)) / total); display_progress(progress_state, written); } + return pos; +} - close(fd); - written = reuse_packfile_objects; - display_progress(progress_state, written); - return reuse_packfile_offset - sizeof(struct pack_header); +static void write_reused_pack(struct hashfile *f) +{ + size_t i = 0; + uint32_t offset; + struct pack_window *w_curs = NULL; + + if (allow_ofs_delta) + i = write_reused_pack_verbatim(f, &w_curs); + + for (; i < reuse_packfile_bitmap->word_alloc; ++i) { + eword_t word = reuse_packfile_bitmap->words[i]; + size_t pos = (i * BITS_IN_EWORD); + + for (offset = 0; offset < BITS_IN_EWORD; ++offset) { + if ((word >> offset) == 0) + break; + + offset += ewah_bit_ctz64(word >> offset); + write_reused_pack_one(pos + offset, f, &w_curs); + display_progress(progress_state, ++written); + } + } + + unuse_pack(&w_curs); } static const char no_split_warning[] = N_( @@ -868,11 +996,9 @@ static void write_pack_file(void) offset = write_pack_header(f, nr_remaining); if (reuse_packfile) { - off_t packfile_size; assert(pack_to_stdout); - - packfile_size = write_reused_pack(f); - offset += packfile_size; + write_reused_pack(f); + offset = hashfile_total(f); } nr_written = 0; @@ -1001,6 +1127,10 @@ static int have_duplicate_entry(const struct object_id *oid, { struct object_entry *entry; + if (reuse_packfile_bitmap && + bitmap_walk_contains(bitmap_git, reuse_packfile_bitmap, oid)) + return 1; + entry = packlist_find(&to_pack, oid); if (!entry) return 0; @@ -2555,7 +2685,9 @@ static void ll_find_deltas(struct object_entry **list, unsigned list_size, static int obj_is_packed(const struct object_id *oid) { - return !!packlist_find(&to_pack, oid); + return packlist_find(&to_pack, oid) || + (reuse_packfile_bitmap && + bitmap_walk_contains(bitmap_git, reuse_packfile_bitmap, oid)); } static void add_tag_chain(const struct object_id *oid) @@ -2661,6 +2793,7 @@ static void prepare_pack(int window, int 
depth) if (nr_deltas && n > 1) { unsigned nr_done = 0; + if (progress) progress_state = start_progress(_("Compressing objects"), nr_deltas); @@ -3042,7 +3175,6 @@ static int pack_options_allow_reuse(void) { return allow_pack_reuse && pack_to_stdout && - allow_ofs_delta && !ignore_packed_keep_on_disk && !ignore_packed_keep_in_core && (!local || !have_non_local_packs) && @@ -3059,7 +3191,7 @@ static int get_object_list_from_bitmap(struct rev_info *revs) bitmap_git, &reuse_packfile, &reuse_packfile_objects, - &reuse_packfile_offset)) { + &reuse_packfile_bitmap)) { assert(reuse_packfile_objects); nr_result += reuse_packfile_objects; display_progress(progress_state, nr_result); diff --git a/pack-bitmap.c b/pack-bitmap.c index 8a51302a1a..cbfc544411 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -326,6 +326,13 @@ static int load_pack_bitmap(struct bitmap_index *bitmap_git) munmap(bitmap_git->map, bitmap_git->map_size); bitmap_git->map = NULL; bitmap_git->map_size = 0; + + kh_destroy_oid_map(bitmap_git->bitmaps); + bitmap_git->bitmaps = NULL; + + kh_destroy_oid_pos(bitmap_git->ext_index.positions); + bitmap_git->ext_index.positions = NULL; + return -1; } @@ -764,65 +771,126 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs) return NULL; } -int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, - struct packed_git **packfile, - uint32_t *entries, - off_t *up_to) +static void try_partial_reuse(struct bitmap_index *bitmap_git, + size_t pos, + struct bitmap *reuse, + struct pack_window **w_curs) { + struct revindex_entry *revidx; + off_t offset; + enum object_type type; + unsigned long size; + + if (pos >= bitmap_git->pack->num_objects) + return; /* not actually in the pack */ + + revidx = &bitmap_git->pack->revindex[pos]; + offset = revidx->offset; + type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); + if (type < 0) + return; /* broken packfile, punt */ + + if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) { + off_t 
base_offset; + int base_pos; + + /* + * Find the position of the base object so we can look it up + * in our bitmaps. If we can't come up with an offset, or if + * that offset is not in the revidx, the pack is corrupt. + * There's nothing we can do, so just punt on this object, + * and the normal slow path will complain about it in + * more detail. + */ + base_offset = get_delta_base(bitmap_git->pack, w_curs, + &offset, type, revidx->offset); + if (!base_offset) + return; + base_pos = find_revindex_position(bitmap_git->pack, base_offset); + if (base_pos < 0) + return; + + /* + * We assume delta dependencies always point backwards. This + * lets us do a single pass, and is basically always true + * due to the way OFS_DELTAs work. You would not typically + * find REF_DELTA in a bitmapped pack, since we only bitmap + * packs we write fresh, and OFS_DELTA is the default. But + * let's double check to make sure the pack wasn't written with + * odd parameters. + */ + if (base_pos >= pos) + return; + + /* + * And finally, if we're not sending the base as part of our + * reuse chunk, then don't send this object either. The base + * would come after us, along with other objects not + * necessarily in the pack, which means we'd need to convert + * to REF_DELTA on the fly. Better to just let the normal + * object_entry code path handle it. + */ + if (!bitmap_get(reuse, base_pos)) + return; + } + /* - * Reuse the packfile content if we need more than - * 90% of its objects + * If we got here, then the object is OK to reuse. Mark it. 
*/ - static const double REUSE_PERCENT = 0.9; + bitmap_set(reuse, pos); +} +int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, + struct packed_git **packfile_out, + uint32_t *entries, + struct bitmap **reuse_out) +{ struct bitmap *result = bitmap_git->result; - uint32_t reuse_threshold; - uint32_t i, reuse_objects = 0; + struct bitmap *reuse; + struct pack_window *w_curs = NULL; + size_t i = 0; + uint32_t offset; assert(result); - for (i = 0; i < result->word_alloc; ++i) { - if (result->words[i] != (eword_t)~0) { - reuse_objects += ewah_bit_ctz64(~result->words[i]); - break; - } - - reuse_objects += BITS_IN_EWORD; - } + while (i < result->word_alloc && result->words[i] == (eword_t)~0) + i++; -#ifdef GIT_BITMAP_DEBUG - { - const unsigned char *sha1; - struct revindex_entry *entry; + /* Don't mark objects not in the packfile */ + if (i > bitmap_git->pack->num_objects / BITS_IN_EWORD) + i = bitmap_git->pack->num_objects / BITS_IN_EWORD; - entry = &bitmap_git->reverse_index->revindex[reuse_objects]; - sha1 = nth_packed_object_sha1(bitmap_git->pack, entry->nr); + reuse = bitmap_word_alloc(i); + memset(reuse->words, 0xFF, i * sizeof(eword_t)); - fprintf(stderr, "Failed to reuse at %d (%016llx)\n", - reuse_objects, result->words[i]); - fprintf(stderr, " %s\n", hash_to_hex(sha1)); - } -#endif + for (; i < result->word_alloc; ++i) { + eword_t word = result->words[i]; + size_t pos = (i * BITS_IN_EWORD); - if (!reuse_objects) - return -1; + for (offset = 0; offset < BITS_IN_EWORD; ++offset) { + if ((word >> offset) == 0) + break; - if (reuse_objects >= bitmap_git->pack->num_objects) { - bitmap_git->reuse_objects = *entries = bitmap_git->pack->num_objects; - *up_to = -1; /* reuse the full pack */ - *packfile = bitmap_git->pack; - return 0; + offset += ewah_bit_ctz64(word >> offset); + try_partial_reuse(bitmap_git, pos + offset, reuse, &w_curs); + } } - reuse_threshold = bitmap_popcount(bitmap_git->result) * REUSE_PERCENT; + unuse_pack(&w_curs); - if 
(reuse_objects < reuse_threshold) + *entries = bitmap_popcount(reuse); + if (!*entries) { + bitmap_free(reuse); return -1; + } - bitmap_git->reuse_objects = *entries = reuse_objects; - *up_to = bitmap_git->pack->revindex[reuse_objects].offset; - *packfile = bitmap_git->pack; - + /* + * Drop any reused objects from the result, since they will not + * need to be handled separately. + */ + bitmap_and_not(result, reuse); + *packfile_out = bitmap_git->pack; + *reuse_out = reuse; return 0; } diff --git a/pack-bitmap.h b/pack-bitmap.h index 6ab6033dbe..bcd03b8993 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -50,7 +50,8 @@ void test_bitmap_walk(struct rev_info *revs); struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs); int reuse_partial_packfile_from_bitmap(struct bitmap_index *, struct packed_git **packfile, - uint32_t *entries, off_t *up_to); + uint32_t *entries, + struct bitmap **reuse_out); int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping, kh_oid_map_t *reused_bitmaps, int show_progress); void free_bitmap_index(struct bitmap_index *);
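
As a side note for readers of patch 9/9: when write_reused_pack_one() has a non-zero fixup, it re-encodes the distance to the delta base using the pack format's variable-length OFS_DELTA encoding. A minimal standalone sketch of that encoding follows. The function names encode_ofs_distance() and decode_ofs_distance() are hypothetical; the encoder follows the loop in the patch, and the decoder is an assumption about the matching read side, not a quote of get_delta_base().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode "ofs" (distance from a delta to its base) into "out", which
 * must have room for at least 10 bytes. Returns the number of bytes
 * written. Bytes are produced last-to-first, as in the patch's loop.
 */
static size_t encode_ofs_distance(uint64_t ofs, unsigned char *out)
{
	unsigned char buf[10];
	size_t i = sizeof(buf) - 1;

	assert(out != NULL);
	buf[i] = ofs & 127;                      /* final byte: no continuation bit */
	while (ofs >>= 7)
		buf[--i] = 128 | (--ofs & 127);  /* earlier bytes: continuation bit set */

	memcpy(out, buf + i, sizeof(buf) - i);
	return sizeof(buf) - i;
}

/*
 * Decode a distance written by encode_ofs_distance(). Stores the
 * number of bytes consumed in "*used".
 */
static uint64_t decode_ofs_distance(const unsigned char *in, size_t *used)
{
	size_t n = 0;
	unsigned char c = in[n++];
	uint64_t ofs = c & 127;

	while (c & 128) {
		ofs += 1;                        /* undo the "--ofs" bias of the encoder */
		c = in[n++];
		ofs = (ofs << 7) + (c & 127);
	}
	*used = n;
	return ofs;
}
```

The "--ofs" in the encoder (and the matching "+= 1" in the decoder) means multi-byte encodings never overlap with shorter ones: a distance of 127 takes one byte (0x7f), while 128 takes two (0x80 0x00).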