
[v2,2/2] Btrfs: compression must free at least PAGE_SIZE

Message ID 20170520164953.7344-3-nefelim4ag@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Timofey Titovets May 20, 2017, 4:49 p.m. UTC
Btrfs already skips storing data when compression does not save at least one byte.
Make this logic stricter and require that compression save at least one PAGE_SIZE,
because otherwise it is not useful to store the data compressed.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
---
 fs/btrfs/lzo.c  | 5 ++++-
 fs/btrfs/zlib.c | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)

--
2.13.0

Comments

Kai Krakow May 20, 2017, 5:14 p.m. UTC | #1
Am Sat, 20 May 2017 19:49:53 +0300
schrieb Timofey Titovets <nefelim4ag@gmail.com>:

> Btrfs already skips storing data when compression does not save at
> least one byte. Make this logic stricter and require that compression
> save at least one PAGE_SIZE, because otherwise it is not useful to
> store the data compressed.
> 
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> ---
>  fs/btrfs/lzo.c  | 5 ++++-
>  fs/btrfs/zlib.c | 3 ++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
> index bd0b0938..7f38bc3c 100644
> --- a/fs/btrfs/lzo.c
> +++ b/fs/btrfs/lzo.c
> @@ -229,8 +229,11 @@ static int lzo_compress_pages(struct list_head *ws,
>  	in_len = min(bytes_left, PAGE_SIZE);
>  	}
> 
> -	if (tot_out > tot_in)
> +	/* Compression must save at least one PAGE_SIZE */
> +	if (tot_out + PAGE_SIZE => tot_in) {

Shouldn't this be ">" instead of ">=" (btw, I don't think => works)...

Given the case that tot_in is 8192, and tot_out is 4096, we saved a
complete page but 4096+4096 would still be equal to 8192.

The former logic only assumed that there is no point in compression
if we saved 0 bytes.
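
For clarity, here is a minimal sketch of the check I would expect, reusing
the identifiers from the patch context (untested):

	/* Require that compression saves at least one full page */
	if (tot_out + PAGE_SIZE > tot_in) {
		ret = -E2BIG;
		goto out;
	}

With tot_in = 8192 and tot_out = 4096 this keeps the compressed data
(4096 + 4096 > 8192 is false), which is what we want.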

BTW: What's the smallest block size that btrfs stores? Is it always
PAGE_SIZE? I'm not familiar with btrfs internals...

> +		ret = -E2BIG;
>  		goto out;
> +	}
> 
>  	/* store the size of all chunks of compressed data */
>  	cpage_out = kmap(pages[0]);
> diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
> index 135b1082..2b04259b 100644
> --- a/fs/btrfs/zlib.c
> +++ b/fs/btrfs/zlib.c
> @@ -191,7 +191,8 @@ static int zlib_compress_pages(struct list_head *ws,
>  		goto out;
>  	}
> 
> -	if (workspace->strm.total_out >= workspace->strm.total_in) {
> +	/* Compression must save at least one PAGE_SIZE */
> +	if (workspace->strm.total_out + PAGE_SIZE >= workspace->strm.total_in) {
>  		ret = -E2BIG;

Same as above...

>  		goto out;
>  	}
> --
> 2.13.0
Timofey Titovets May 20, 2017, 6:30 p.m. UTC | #2
2017-05-20 20:14 GMT+03:00 Kai Krakow <hurikhan77@gmail.com>:
> Am Sat, 20 May 2017 19:49:53 +0300
> schrieb Timofey Titovets <nefelim4ag@gmail.com>:
>
>> Btrfs already skips storing data when compression does not save at
>> least one byte. Make this logic stricter and require that compression
>> save at least one PAGE_SIZE, because otherwise it is not useful to
>> store the data compressed.
>>
>> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
>> ---
>>  fs/btrfs/lzo.c  | 5 ++++-
>>  fs/btrfs/zlib.c | 3 ++-
>>  2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
>> index bd0b0938..7f38bc3c 100644
>> --- a/fs/btrfs/lzo.c
>> +++ b/fs/btrfs/lzo.c
>> @@ -229,8 +229,11 @@ static int lzo_compress_pages(struct list_head *ws,
>>       in_len = min(bytes_left, PAGE_SIZE);
>>       }
>>
>> -     if (tot_out > tot_in)
>> +     /* Compression must save at least one PAGE_SIZE */
>> +     if (tot_out + PAGE_SIZE => tot_in) {
>
> Shouldn't this be ">" instead of ">=" (btw, I don't think => works)...
(D'oh, typo, thanks)

> Given the case that tot_in is 8192, and tot_out is 4096, we saved a
> complete page but 4096+4096 would still be equal to 8192.
>
> The former logic only pretended that there is no point in compression
> if we saved 0 bytes.

Before my changes:
lzo: if the compressed data takes the same or less space than the
uncompressed data -> store the compressed version
zlib: if compression does not save at least one byte -> do not store
the data compressed

The zlib logic is the right one, so I just copy-pasted it. But after my
changes the equal-size case is no longer the boundary (i.e. it is fine
when tot_out + PAGE_SIZE == tot_in), so you are right: ">=" -> ">".

I will fix that and resend the patch set, thanks.
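
Roughly, the corrected checks should look like this (untested sketch,
same identifiers as in v2):

	/* fs/btrfs/lzo.c */
	/* Compression must save at least one PAGE_SIZE */
	if (tot_out + PAGE_SIZE > tot_in) {
		ret = -E2BIG;
		goto out;
	}

	/* fs/btrfs/zlib.c */
	/* Compression must save at least one PAGE_SIZE */
	if (workspace->strm.total_out + PAGE_SIZE > workspace->strm.total_in) {
		ret = -E2BIG;
		goto out;
	}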

> BTW: What's the smallest block size that btrfs stores? Is it always
> PAGE_SIZE? I'm not familiar with btrfs internals...
https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs
AFAIK btrfs stores and accounts data in PAGE_SIZE'd blocks,
so it should be useful to check that compression saves us at least
one PAGE_SIZE of space.
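
In other words (a rough illustration, not actual btrfs code): a compressed
extent still occupies whole pages on disk, so a saving smaller than one
page frees nothing:

	/* hypothetical accounting example with PAGE_SIZE == 4096 */
	size_t pages_in  = DIV_ROUND_UP(tot_in, PAGE_SIZE);  /* 8192 -> 2 pages */
	size_t pages_out = DIV_ROUND_UP(tot_out, PAGE_SIZE); /* 7000 -> 2 pages, nothing freed */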

>
>> +             ret = -E2BIG;
>>               goto out;
>> +     }
>>
>>       /* store the size of all chunks of compressed data */
>>       cpage_out = kmap(pages[0]);
>> diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
>> index 135b1082..2b04259b 100644
>> --- a/fs/btrfs/zlib.c
>> +++ b/fs/btrfs/zlib.c
>> @@ -191,7 +191,8 @@ static int zlib_compress_pages(struct list_head *ws,
>>               goto out;
>>       }
>>
>> -     if (workspace->strm.total_out >= workspace->strm.total_in) {
>> +     /* Compression must save at least one PAGE_SIZE */
>> +     if (workspace->strm.total_out + PAGE_SIZE >= workspace->strm.total_in) {
>>               ret = -E2BIG;
>
> Same as above...
>
>>               goto out;
>>       }
>> --
>> 2.13.0
>
>
> --
> Regards,
> Kai
Duncan May 21, 2017, 1:38 a.m. UTC | #3
Timofey Titovets posted on Sat, 20 May 2017 21:30:47 +0300 as excerpted:

> 2017-05-20 20:14 GMT+03:00 Kai Krakow <hurikhan77@gmail.com>:
> 
>> BTW: What's the smallest block size that btrfs stores? Is it always
>> PAGE_SIZE? I'm not familiar with btrfs internals...

Thanks for asking the question. =:^)  I hadn't made the connection to the 
conflicting patch set below until I saw it.

> https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs

> AFAIK btrfs stores and accounts data in PAGE_SIZE'd blocks,
> so it should be useful to check that compression saves us at least
> one PAGE_SIZE of space.

[Not a dev, just a btrfs list regular and btrfs user.  If the devs say 
different...]

While AFAIK, btrfs now saves data in page-size blocks (tho note that if 
the entire file is under a block it may be inlined into metadata 
depending on various limits, at least one of which is configurable)...

There is a sub-page-size patch set that has already been thru several 
revision cycles and AFAIK, remains on the roadmap for eventual merge.

You'd need to check the list or ask the devs about current status, but 
it's quite possible the current code checking for at least one byte of 
compression savings, instead of the at least PAGE_SIZE bytes of savings 
that at present would make more sense, is in anticipation of that patch 
set being merged.

If the sub-page-size block patch-set is no longer anticipated to be 
merged in the relatively near future, it may be worth changing the 
compression profit checks to PAGE_SIZE minimum, as this patch set does, 
but the two patch sets clearly do conflict in concept, so merging this 
one is unlikely to be wise if the other one is still on track for merging.
Timofey Titovets May 21, 2017, 4:54 p.m. UTC | #4
2017-05-21 4:38 GMT+03:00 Duncan <1i5t5.duncan@cox.net>:
> Timofey Titovets posted on Sat, 20 May 2017 21:30:47 +0300 as excerpted:
>
>> 2017-05-20 20:14 GMT+03:00 Kai Krakow <hurikhan77@gmail.com>:
>>
>>> BTW: What's the smallest block size that btrfs stores? Is it always
>>> PAGE_SIZE? I'm not familiar with btrfs internals...
>
> Thanks for asking the question. =:^)  I hadn't made the connection to the
> conflicting patch set below until I saw it.
>
>> https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs
>
>> AFAIK btrfs stores and accounts data in PAGE_SIZE'd blocks,
>> so it should be useful to check that compression saves us at least
>> one PAGE_SIZE of space.
>
> [Not a dev, just a btrfs list regular and btrfs user.  If the devs say
> different...]
>
> While AFAIK, btrfs now saves data in page-size blocks (tho note that if
> the entire file is under a block it may be inlined into metadata
> depending on various limits, at least one of which is configurable)...
>
> There is a sub-page-size patch set that has already been thru several
> revision cycles and AFAIK, remains on the roadmap for eventual merge.
>
> You'd need to check the list or ask the devs about current status, but
> it's quite possible the current code checking for at least one byte of
> compression savings, instead of the at least PAGE_SIZE bytes of savings
> that at present would make more sense, is in anticipation of that patch
> set being merged.
>
> If the sub-page-size block patch-set is no longer anticipated to be
> merged in the relatively near future, it may be worth changing the
> compression profit checks to PAGE_SIZE minimum, as this patch set does,
> but the two patch sets clearly do conflict in concept, so merging this
> one is unlikely to be wise if the other one is still on track for merging.

Sorry, I know about the subpagesize-blocksize patch set, but I don't
understand where you see a conflict.

Can you explain what you mean?

By PAGE_SIZE I mean the fs cluster size in my patch set.
In theory it would also be possible to use a magic constant like 4096,
but then this change would be useless on PowerPC64.

Also (AFAIK) most storage engines use a 4KiB+ sector size,
so (IMHO) it is bad practice to use a block size smaller than 4KiB and
to try to save less than 4KiB.

As this only changes the decision logic, it can't and will not break
anything in the subpage patch set; it can only lead to false positives,
where well-compressed data is not stored compressed.

So if and when the subpage patch set is merged, PAGE_SIZE should be
replaced with the sector size, and everything will continue to work
correctly.
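
Hypothetically, once the sector size is reachable from the compression
code (it is not passed in today, so the 'sectorsize' variable below is
only an assumption for the sketch), the check would become:

	/* 'sectorsize' would have to be plumbed into the *_compress_pages path */
	if (tot_out + sectorsize > tot_in) {
		ret = -E2BIG;
		goto out;
	}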

Thanks.
Roman Mamedov May 21, 2017, 5:30 p.m. UTC | #5
On Sun, 21 May 2017 19:54:05 +0300
Timofey Titovets <nefelim4ag@gmail.com> wrote:

> Sorry, I know about the subpagesize-blocksize patch set, but I don't
> understand where you see a conflict.
> 
> Can you explain what you mean?
> 
> By PAGE_SIZE I mean the fs cluster size in my patch set.

This appears to be exactly the conflict. Subpagesize blocksize patchset would
make it possible to use e.g. Btrfs with 4K block (cluster) size on a MIPS
machine with 64K-sized pages. Would your checking for PAGE_SIZE still be
correct then?

> So if and when the subpage patch set is merged, PAGE_SIZE should be
> replaced with the sector size, and everything will continue to work correctly.

I guess Duncan's question was why not compare against block size from the get
go, rather than create more places for Chandan to scour through to eliminate
all "blocksize = pagesize" assumptions...
Timofey Titovets May 21, 2017, 10:32 p.m. UTC | #6
2017-05-21 20:30 GMT+03:00 Roman Mamedov <rm@romanrm.net>:
> On Sun, 21 May 2017 19:54:05 +0300
> Timofey Titovets <nefelim4ag@gmail.com> wrote:
>
>> Sorry, I know about the subpagesize-blocksize patch set, but I don't
>> understand where you see a conflict.
>>
>> Can you explain what you mean?
>>
>> By PAGE_SIZE I mean the fs cluster size in my patch set.
>
> This appears to be exactly the conflict. Subpagesize blocksize patchset would
> make it possible to use e.g. Btrfs with 4K block (cluster) size on a MIPS
> machine with 64K-sized pages. Would your checking for PAGE_SIZE still be
> correct then?

Nope, the logic would be incorrect: it would not allow compressing
blocks even when that would be profitable. But btrfs, today and since
2009, can't be mounted with a different cluster size.

IMHO, since I work with code from the latest stable kernel, this is not
a big issue (see below).

>> So if and when the subpage patch set is merged, PAGE_SIZE should be
>> replaced with the sector size, and everything will continue to work correctly.
>
> I guess Duncan's question was why not compare against block size from the get
> go, rather than create more places for Chandan to scour through to eliminate
> all "blocksize = pagesize" assumptions...
> --
> With respect,
> Roman

I did not want to hurt anyone, but:
- I did it this way because it is easy and safe for all other code and
  for btrfs stability
  (also, I don't have enough kernel/C experience for a complex rework).
- If I tried to export the sector size to the compression code,
  I would have to rework a lot of surrounding code, not just fix 2 lines;
  it is complex, it is unsafe, and it would break Chandan's patches
  (because the subpage block size work does the same things).
  It would turn a small patch into a big patch series, and you know what
  happens with big patch series...
- I want to make compression better for everyone now, not in several
  years, if possible
  (the subpage patches have not been finished and merged since 2012,
  i.e. around five years).
- It should not be a problem for merging Chandan's patch set, IMHO,
  because this does not touch how btrfs works with disk and memory.
- AFAIK in the V21 subpage patch set, compression on machines with 64KiB
  pages does not work as expected [1].
  So the subpage patch series' compression code needs to be reworked
  anyway, no matter whether my patches are merged or not.

1. https://www.spinics.net/lists/linux-btrfs/msg59393.html

Thanks.
Duncan May 22, 2017, 9:44 a.m. UTC | #7
Timofey Titovets posted on Mon, 22 May 2017 01:32:21 +0300 as excerpted:

> 2017-05-21 20:30 GMT+03:00 Roman Mamedov <rm@romanrm.net>:
>> On Sun, 21 May 2017 19:54:05 +0300 Timofey Titovets
>> <nefelim4ag@gmail.com> wrote:
>>
>>> Sorry, I know about the subpagesize-blocksize patch set, but I don't
>>> understand where you see a conflict.
>>>
>>> Can you explain what you mean?
>>>
>>> By PAGE_SIZE I mean the fs cluster size in my patch set.
>>
>> This appears to be exactly the conflict. Subpagesize blocksize patchset
>> would make it possible to use e.g. Btrfs with 4K block (cluster) size
>> on a MIPS machine with 64K-sized pages. Would your checking for
>> PAGE_SIZE still be correct then?
> 
> Nope

>> I guess Duncan's question was why not compare against block size from
>> the get go, rather than create more places for Chandan to scour through
>> to eliminate all "blocksize = pagesize" assumptions...

> - If I tried to export the sector size to the compression code, [...]
>   It would turn a small patch into a big patch series, and you know what
>   happens with big patch series...

Good point...

> - AFAIK in the V21 subpage patch set, compression on machines with 64KiB
>   pages does not work as expected [1].
>   So the subpage patch series' compression code needs to be reworked
>   anyway, no matter whether my patches are merged or not.

I guess I'd just like to have Chandan's input here too.  It sounds like 
it shouldn't be too much more to deal with given the size and complexity 
of that set already, but it would be good to know that he's aware of it 
and that the two are coordinated, so as not to make that journey 
unnecessarily harder than it already is.

[Not that I as a non-dev can really say anything, but it can't /hurt/ to 
already have his ack, when this gets reviewed by those whose decision 
/does/ count.]
kernel test robot May 23, 2017, 6:05 a.m. UTC | #8
Hi Timofey,

[auto build test ERROR on v4.9-rc8]
[also build test ERROR on next-20170522]
[cannot apply to btrfs/next]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Timofey-Titovets/Btrfs-lzo-c-pr_debug-deflate-lzo/20170523-110651
config: x86_64-kexec (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   fs/btrfs/lzo.c: In function 'lzo_compress_pages':
>> fs/btrfs/lzo.c:233:27: error: expected expression before '>' token
     if (tot_out + PAGE_SIZE => tot_in) {
                              ^

vim +233 fs/btrfs/lzo.c

   227			in_page = find_get_page(mapping, start >> PAGE_SHIFT);
   228			data_in = kmap(in_page);
   229			in_len = min(bytes_left, PAGE_SIZE);
   230		}
   231	
   232		/* Compression must save at least one PAGE_SIZE */
 > 233		if (tot_out + PAGE_SIZE => tot_in) {
   234			ret = -E2BIG;
   235			goto out;
   236		}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Patch

diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index bd0b0938..7f38bc3c 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -229,8 +229,11 @@  static int lzo_compress_pages(struct list_head *ws,
 		in_len = min(bytes_left, PAGE_SIZE);
 	}

-	if (tot_out > tot_in)
+	/* Compression must save at least one PAGE_SIZE */
+	if (tot_out + PAGE_SIZE => tot_in) {
+		ret = -E2BIG;
 		goto out;
+	}

 	/* store the size of all chunks of compressed data */
 	cpage_out = kmap(pages[0]);
diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
index 135b1082..2b04259b 100644
--- a/fs/btrfs/zlib.c
+++ b/fs/btrfs/zlib.c
@@ -191,7 +191,8 @@  static int zlib_compress_pages(struct list_head *ws,
 		goto out;
 	}

-	if (workspace->strm.total_out >= workspace->strm.total_in) {
+	/* Compression must save at least one PAGE_SIZE */
+	if (workspace->strm.total_out + PAGE_SIZE >= workspace->strm.total_in) {
 		ret = -E2BIG;
 		goto out;
 	}