From patchwork Tue Jun 23 15:24:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 11620923
Date: Tue, 23 Jun 2020 11:24:56 -0400
From: Jeff King
To: git@vger.kernel.org
Cc: Eric Sunshine, Junio C Hamano, Johannes Schindelin
Subject: [PATCH 05/10] fast-export: stop storing lengths in anonymized hashmaps
Message-ID: <20200623152456.GE1435482@coredump.intra.peff.net>
References: <20200623152436.GA50925@coredump.intra.peff.net>
Content-Disposition: inline
In-Reply-To: <20200623152436.GA50925@coredump.intra.peff.net>
Sender: git-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Now that
the anonymize_str() interface is restricted to NUL-terminated strings,
there's no need for us to keep track of the length of each entry in the
hashmap. This simplifies the code and saves a bit of memory.

Note that we do still need to compare the stored results to partial
strings passed in by the callers. We can do that by using hashmap's
keydata feature to get the ptr/len pair into the comparison function,
and then using strncmp().

Signed-off-by: Jeff King
---
 builtin/fast-export.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index d8ea067630..5df2ada47d 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -121,23 +121,32 @@ static int has_unshown_parent(struct commit *commit)
 struct anonymized_entry {
 	struct hashmap_entry hash;
 	const char *orig;
-	size_t orig_len;
 	const char *anon;
-	size_t anon_len;
+};
+
+struct anonymized_entry_key {
+	struct hashmap_entry hash;
+	const char *orig;
+	size_t orig_len;
 };
 
 static int anonymized_entry_cmp(const void *unused_cmp_data,
 				const struct hashmap_entry *eptr,
 				const struct hashmap_entry *entry_or_key,
-				const void *unused_keydata)
+				const void *keydata)
 {
 	const struct anonymized_entry *a, *b;
 
 	a = container_of(eptr, const struct anonymized_entry, hash);
-	b = container_of(entry_or_key, const struct anonymized_entry, hash);
+	if (keydata) {
+		const struct anonymized_entry_key *key = keydata;
+		int equal = !strncmp(a->orig, key->orig, key->orig_len) &&
+			    !a->orig[key->orig_len];
+		return !equal;
+	}
 
-	return a->orig_len != b->orig_len ||
-		memcmp(a->orig, b->orig, a->orig_len);
+	b = container_of(entry_or_key, const struct anonymized_entry, hash);
+	return strcmp(a->orig, b->orig);
 }
 
 /*
@@ -149,23 +158,22 @@ static const char *anonymize_str(struct hashmap *map,
 				 char *(*generate)(const char *, size_t),
 				 const char *orig, size_t len)
 {
-	struct anonymized_entry key, *ret;
+	struct anonymized_entry_key key;
+	struct anonymized_entry *ret;
 
 	if (!map->cmpfn)
 		hashmap_init(map, anonymized_entry_cmp, NULL, 0);
 
 	hashmap_entry_init(&key.hash, memhash(orig, len));
 	key.orig = orig;
 	key.orig_len = len;
-	ret = hashmap_get_entry(map, &key, hash, NULL);
+	ret = hashmap_get_entry(map, &key, hash, &key);
 
 	if (!ret) {
 		ret = xmalloc(sizeof(*ret));
 		hashmap_entry_init(&ret->hash, key.hash.hash);
 		ret->orig = xmemdupz(orig, len);
-		ret->orig_len = len;
 		ret->anon = generate(orig, len);
-		ret->anon_len = strlen(ret->anon);
 		hashmap_put(map, &ret->hash);
 	}
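
For readers who want to see the keydata comparison in isolation, here is a
minimal standalone sketch of the strncmp()-plus-terminator check described
in the commit message. The helper name stored_matches_key() is hypothetical
and not part of the patch. The sketch assumes the caller's ptr/len pair
contains no NUL within its first len bytes, which holds for the partial
strings passed to anonymize_str().

```c
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: returns 1 if the
 * NUL-terminated string `stored` is exactly the first `len` bytes
 * of `key`, and 0 otherwise.
 *
 * Assumes `key` has no NUL in its first `len` bytes. Under that
 * assumption a strncmp() match proves `stored` has at least `len`
 * non-NUL bytes, so reading stored[len] stays in bounds. The &&
 * short-circuits, so stored[len] is never read after a mismatch.
 */
static int stored_matches_key(const char *stored, const char *key, size_t len)
{
	return !strncmp(stored, key, len) && !stored[len];
}
```

The second condition is what rejects a stored entry that merely starts with
the key: strncmp() alone would treat a stored "refs/heads/master" as equal to
a ten-byte key of "refs/heads", while the terminator check requires the
stored string to end exactly where the key does.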