[v5,4/6] Btrfs: heuristic add detection of zeroed sample

Message ID 20170823002650.3133-5-nefelim4ag@gmail.com (mailing list archive)
State New, archived

Commit Message

Timofey Titovets Aug. 23, 2017, 12:26 a.m. UTC
Use memcmp() to check whether the sample data is all zeroes.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
 fs/btrfs/heuristic.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

--
2.14.1
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Diego Calleja Aug. 23, 2017, 5:55 p.m. UTC | #1
On Wednesday, 23 August 2017 2:26:48 (CEST), Timofey Titovets wrote:
> +	for (i = 0; i < workspace->sample_size; i += sizeof(zero)) {
> +		if (memcmp(&workspace->sample[i], &zero, sizeof(zero)))
> +			return false;

Instead of just checking for 0, wouldn't it be a better idea to check
for any kind of repetitions?

As in, iterate over the sample and memcmp() each part of sample with
the previous one. The cost would be the same, and it would detect not
just zeros, but any kind of repeated data. Is there any reason I'm
missing for not doing this?
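
The suggestion above could be sketched roughly as follows. This is standalone userspace C, not kernel code and not the updated patch; `sample_repeated` and `CHUNK` are hypothetical names chosen for illustration, with `CHUNK` standing in for the patch's `READ_SIZE`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk width; the patch compares READ_SIZE bytes at a time. */
#define CHUNK 16

/*
 * Return true when every CHUNK-sized piece of the sample equals the
 * piece before it, i.e. the sample is one CHUNK-byte pattern repeated.
 * A trailing piece shorter than CHUNK is compared against the start of
 * the previous piece, so nothing past `size` is ever read.
 */
static bool sample_repeated(const unsigned char *sample, size_t size)
{
	size_t i;

	for (i = CHUNK; i < size; i += CHUNK) {
		size_t len = size - i < CHUNK ? size - i : CHUNK;

		if (memcmp(&sample[i - CHUNK], &sample[i], len))
			return false;
	}

	return true;
}
```

An all-zero sample passes this check as a special case. Note that it only catches patterns whose period divides `CHUNK`; data repeating with some other period would still fall through to the rest of the heuristic.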
Timofey Titovets Aug. 23, 2017, 8:03 p.m. UTC | #2
2017-08-23 20:55 GMT+03:00 Diego Calleja <diegocg@gmail.com>:
> On Wednesday, 23 August 2017 2:26:48 (CEST), Timofey Titovets wrote:
>> +     for (i = 0; i < workspace->sample_size; i += sizeof(zero)) {
>> +             if (memcmp(&workspace->sample[i], &zero, sizeof(zero)))
>> +                     return false;
>
> Instead of just checking for 0, wouldn't it be a better idea to check
> for any kind of repetitions?
>
> As in, iterate over the sample and memcmp() each part of sample with
> the previous one. The cost would be the same, and it would detect not
> just zeros, but any kind of repeated data. Is there any reason I'm
> missing for not doing this?

Thank you, I had not thought about that.
That approach seems better; I will update the patch.

Patch

diff --git a/fs/btrfs/heuristic.c b/fs/btrfs/heuristic.c
index 5336638a3b7c..4557ea1db373 100644
--- a/fs/btrfs/heuristic.c
+++ b/fs/btrfs/heuristic.c
@@ -73,6 +73,21 @@  static struct list_head *heuristic_alloc_workspace(void)
 	return ERR_PTR(-ENOMEM);
 }

+static bool sample_zeroed(struct workspace *workspace)
+{
+	u32 i;
+	u8 zero[READ_SIZE];
+
+	memset(&zero, 0, sizeof(zero));
+
+	for (i = 0; i < workspace->sample_size; i += sizeof(zero)) {
+		if (memcmp(&workspace->sample[i], &zero, sizeof(zero)))
+			return false;
+	}
+
+	return true;
+}
+
 static int heuristic(struct list_head *ws, struct inode *inode,
 		     u64 start, u64 end)
 {
@@ -110,6 +125,9 @@  static int heuristic(struct list_head *ws, struct inode *inode,

 	workspace->sample_size = b;

+	if (sample_zeroed(workspace))
+		return 1;
+
 	memset(workspace->bucket, 0, sizeof(*workspace->bucket)*BUCKET_SIZE);

 	for (a = 0; a < workspace->sample_size; a++) {