From patchwork Sat Jul 29 13:36:53 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timofey Titovets
X-Patchwork-Id: 9869947
From: Timofey Titovets
To: linux-btrfs@vger.kernel.org
Cc: Timofey Titovets
Subject: [PATCH v3 1/3] Btrfs: heuristic add simple sampling logic
Date: Sat, 29 Jul 2017 16:36:53 +0300
Message-Id: <20170729133655.31260-2-nefelim4ag@gmail.com>
X-Mailer: git-send-email 2.13.3
In-Reply-To: <20170729133655.31260-1-nefelim4ag@gmail.com>
References: <20170729133655.31260-1-nefelim4ag@gmail.com>

Take a small sample of the input data and count how often each byte
value occurs in that sample, storing the counts in a bucket array.
The bucket records which byte values were detected in the sample and
how many times each one occurred.

Signed-off-by: Timofey Titovets
---
 fs/btrfs/compression.c | 24 ++++++++++++++++++++++--
 fs/btrfs/compression.h | 10 ++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 63f54bd2d5bb..ca7cfaad6e2f 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -1068,15 +1068,35 @@ int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end)
 	u64 index = start >> PAGE_SHIFT;
 	u64 end_index = end >> PAGE_SHIFT;
 	struct page *page;
-	int ret = 1;
+	struct heuristic_bucket_item *bucket;
+	int a, b, ret;
+	u8 symbol, *input_data;
+
+	ret = 1;
+
+	bucket = kcalloc(BTRFS_HEURISTIC_BUCKET_SIZE,
+			sizeof(struct heuristic_bucket_item), GFP_NOFS);
+
+	if (!bucket)
+		goto out;
 
 	while (index <= end_index) {
 		page = find_get_page(inode->i_mapping, index);
-		kmap(page);
+		input_data = kmap(page);
+		a = 0;
+		while (a < PAGE_SIZE) {
+			for (b = 0; b < BTRFS_HEURISTIC_READ_SIZE; b++) {
+				symbol = input_data[a+b];
+				bucket[symbol].count++;
+			}
+			a += BTRFS_HEURISTIC_ITER_OFFSET;
+		}
 		kunmap(page);
 		put_page(page);
 		index++;
 	}
 
+out:
+	kfree(bucket);
 	return ret;
 }
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index d1f4eee2d0af..e30a9df1937e 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -129,6 +129,16 @@ struct btrfs_compress_op {
 extern const struct btrfs_compress_op btrfs_zlib_compress;
 extern const struct btrfs_compress_op btrfs_lzo_compress;
 
+struct heuristic_bucket_item {
+	u8 padding;
+	u8 symbol;
+	u16 count;
+};
+
+#define BTRFS_HEURISTIC_READ_SIZE 16
+#define BTRFS_HEURISTIC_ITER_OFFSET 256
+#define BTRFS_HEURISTIC_BUCKET_SIZE 256
+
 int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end);
 
 #endif
-- 
2.13.3
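
For readers who want to try the sampling pattern outside the kernel, here is a minimal user-space sketch of the loop in this patch. It is an illustration, not part of the patch. The constants mirror BTRFS_HEURISTIC_READ_SIZE, BTRFS_HEURISTIC_ITER_OFFSET and BTRFS_HEURISTIC_BUCKET_SIZE, but SAMPLE_PAGE_SIZE (assumed to be 4 KiB), sample_page() and distinct_symbols() are hypothetical names that do not exist in btrfs.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel uses PAGE_SIZE, which varies by arch. */
#define SAMPLE_PAGE_SIZE	4096
/* Same values as the BTRFS_HEURISTIC_* constants in the patch. */
#define HEURISTIC_READ_SIZE	16
#define HEURISTIC_ITER_OFFSET	256
#define HEURISTIC_BUCKET_SIZE	256

/* Same layout as struct heuristic_bucket_item, with user-space types. */
struct bucket_item {
	uint8_t padding;
	uint8_t symbol;
	uint16_t count;
};

/*
 * Hypothetical helper: read HEURISTIC_READ_SIZE bytes at every
 * HEURISTIC_ITER_OFFSET step of one page and bump the counter of each
 * byte value seen. The caller must zero the bucket first.
 * Returns the number of bytes sampled.
 */
size_t sample_page(const uint8_t *page, struct bucket_item *bucket)
{
	size_t a, b, sampled = 0;

	for (a = 0; a < SAMPLE_PAGE_SIZE; a += HEURISTIC_ITER_OFFSET) {
		for (b = 0; b < HEURISTIC_READ_SIZE; b++) {
			bucket[page[a + b]].count++;
			sampled++;
		}
	}
	return sampled;
}

/* Hypothetical helper: how many distinct byte values the sample contained. */
unsigned int distinct_symbols(const struct bucket_item *bucket)
{
	unsigned int i, n = 0;

	for (i = 0; i < HEURISTIC_BUCKET_SIZE; i++)
		if (bucket[i].count)
			n++;
	return n;
}
```

With these constants each 4 KiB page contributes 16 windows of 16 bytes, so 256 of its 4096 bytes (1/16) are inspected. A zero-filled page puts all 256 samples into bucket[0], which is the kind of skew a later heuristic step could use to decide that the data is compressible.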