target/user: Fix possible overwrite of t_data_sg's last iov[]

Message ID 1488260828-28167-1-git-send-email-lixiubo@cmss.chinamobile.com (mailing list archive)
State Superseded

Commit Message

Xiubo Li Feb. 28, 2017, 5:47 a.m. UTC
From: Xiubo Li <lixiubo@cmss.chinamobile.com>

If there is BIDI data, its first iov[] will overwrite the last
iov[] for se_cmd->t_data_sg.

To fix this, we can just increment the iov pointer, but this may
introduce a new memory leakage bug: if se_cmd->data_length and
se_cmd->t_bidi_data_sg->length are not aligned to DATA_BLOCK_SIZE,
the actual length needed may be larger than just the sum of them.

So, this can be avoided by rounding all the data lengths up
to DATA_BLOCK_SIZE.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
---
 drivers/target/target_core_user.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

Comments

Andy Grover March 3, 2017, 7 p.m. UTC | #1
On 02/27/2017 09:47 PM, lixiubo@cmss.chinamobile.com wrote:
> From: Xiubo Li <lixiubo@cmss.chinamobile.com>
>
> If there is BIDI data, its first iov[] will overwrite the last
> iov[] for se_cmd->t_data_sg.

(+CCing orig BIDI and data block code authors)

Yeah. It looks like because alloc_and_scatter_data_area() (hereafter 
"aasda") is called twice in the BIDI case, and both times iov_cnt is 0, 
the new_iov() call doesn't increment the iov ptr and the first bidi iov 
overwrites the last data iov. Maybe fix this by exiting aasda() with iov 
pointing at the next unused iov in the array? Probably also want to zero 
the iov.
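
The overwrite described above can be reproduced in a small user-space sketch. This is a model under stated assumptions, not the kernel code: model_new_iov() is an assumed simplification of new_iov() that only advances the pointer when the count is non-zero, and struct model_iov, model_fill() and last_data_iov_len() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iovec, to keep the sketch portable. */
struct model_iov {
	void *iov_base;
	size_t iov_len;
};

/*
 * Assumed simplification of new_iov(): the pointer only advances when
 * the caller's count is non-zero, then the slot is zeroed.
 */
static void model_new_iov(struct model_iov **iov, int *iov_cnt)
{
	if (*iov_cnt != 0)
		(*iov)++;
	(*iov_cnt)++;
	memset(*iov, 0, sizeof(**iov));
}

/* Hypothetical stand-in for one aasda() pass that emits n iovs of length len. */
static void model_fill(struct model_iov **iov, int *iov_cnt, int n, size_t len)
{
	for (int i = 0; i < n; i++) {
		model_new_iov(iov, iov_cnt);
		(*iov)->iov_len = len;
	}
}

/*
 * Run a data pass (two iovs, length 100) and a bidi pass (one iov,
 * length 200), and return the length left in the last data slot.
 * bump_between_passes models the caller doing iov++ before the bidi pass.
 */
static size_t last_data_iov_len(int bump_between_passes)
{
	struct model_iov iovs[4];
	struct model_iov *iov = iovs;
	int iov_cnt = 0;

	memset(iovs, 0, sizeof(iovs));
	model_fill(&iov, &iov_cnt, 2, 100);	/* fills iovs[0] and iovs[1] */

	iov_cnt = 0;				/* count restarts for bidi */
	if (bump_between_passes)
		iov++;				/* step past the last data iov */
	model_fill(&iov, &iov_cnt, 1, 200);

	return iovs[1].iov_len;
}
```

In this model, last_data_iov_len(0) finds the bidi length (200) in the last data slot, while last_data_iov_len(1) leaves the data iov (100) intact.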

> To fix this, we can just increment the iov pointer, but this may
> introduce a new memory leakage bug: if se_cmd->data_length and
> se_cmd->t_bidi_data_sg->length are not aligned to DATA_BLOCK_SIZE,
> the actual length needed may be larger than just the sum of them.

This also sounds right. But the solution below rounds up both data and 
bidi lengths to DATA_BLOCK_SIZE separately. That's not quite right 
either, since the current code (once aasda is fixed) allows the last 
data and first bidi iovs to both have allocations from the same data 
block. Shouldn't we be rounding up the sum of data and bidi data_lengths 
to DATA_BLOCK_SIZE? Call this Option A.

Option B is we go with changing the implementation to always use a 
separate data block for BIDI data (BIDI cmds are rare so no big deal), 
but then also please look into simplifying code in aasda() and 
tcmu_queue_cmd_ring that may now be overly complex.

Thanks -- Regards -- Andy

>
> So, this can be avoided by rounding all the data lengths up
> to DATA_BLOCK_SIZE.
>
> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
> ---
>  drivers/target/target_core_user.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index 2e33100..59a18fd 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -429,10 +429,11 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>
>  	mb = udev->mb_addr;
>  	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
> -	data_length = se_cmd->data_length;
> +	data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
>  	if (se_cmd->se_cmd_flags & SCF_BIDI) {
>  		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
> -		data_length += se_cmd->t_bidi_data_sg->length;
> +		data_length += round_up(se_cmd->t_bidi_data_sg->length,
> +				DATA_BLOCK_SIZE);
>  	}
>  	if ((command_size > (udev->cmdr_size / 2)) ||
>  	    data_length > udev->data_size) {
> @@ -503,10 +504,14 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>  	entry->req.iov_dif_cnt = 0;
>
>  	/* Handle BIDI commands */
> -	iov_cnt = 0;
> -	alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
> -		se_cmd->t_bidi_data_nents, &iov, &iov_cnt, false);
> -	entry->req.iov_bidi_cnt = iov_cnt;
> +	if (se_cmd->se_cmd_flags & SCF_BIDI) {
> +		iov_cnt = 0;
> +		iov++;
> +		alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
> +				se_cmd->t_bidi_data_nents, &iov, &iov_cnt,
> +				false);
> +		entry->req.iov_bidi_cnt = iov_cnt;
> +	}
>
>  	/* cmd's data_bitmap is what changed in process */
>  	bitmap_xor(tcmu_cmd->data_bitmap, old_bitmap, udev->data_bitmap,
>

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiubo Li March 6, 2017, 5:48 a.m. UTC | #2
> On 02/27/2017 09:47 PM, lixiubo@cmss.chinamobile.com wrote:
>> From: Xiubo Li <lixiubo@cmss.chinamobile.com>
>>
>> If there is BIDI data, its first iov[] will overwrite the last
>> iov[] for se_cmd->t_data_sg.
>
> (+CCing orig BIDI and data block code authors)
>
> Yeah. It looks like because alloc_and_scatter_data_area() (hereafter 
> "aasda") is called twice in the BIDI case, and both times iov_cnt is 
> 0, the new_iov() call doesn't increment the iov ptr and the first bidi 
> iov overwrites the last data iov.
Yes, it is.

> Maybe fix this by exiting aasda() with iov pointing at the next unused 
> iov in the array? 
Maybe it shouldn't be aasda()'s duty to increment the iov ptr.


> Probably also want to zero the iov.
>
>> To fix this, we can just increment the iov pointer, but this may
>> introduce a new memory leakage bug: if se_cmd->data_length and
>> se_cmd->t_bidi_data_sg->length are not aligned to DATA_BLOCK_SIZE,
>> the actual length needed may be larger than just the sum of them.
>
> This also sounds right. But the solution below rounds up both data and 
> bidi lengths to DATA_BLOCK_SIZE separately. That's not quite right 
> either, since the current code (once aasda is fixed) allows the last 
> data and first bidi iovs to both have allocations from the same data 
> block. Shouldn't we be rounding up the sum of data and bidi 
> data_lengths to DATA_BLOCK_SIZE? Call this Option A.
Maybe the commit message is not very clear. This fix separates the data
blocks from the bidi blocks in the iovs; they won't be allowed to share
the same data block.

For the round_up() changes:

The total data_length is only used for checking the available free
data block space.

For example, when se_cmd->data_length = 1K and t_bidi_data_sg->length
= 2K, the original total data_length == sum(1K + 2K) == 3K, so in
is_ring_space_avail(..., data_length == 3K), if there is only one free
data block available in the ring, it will return TRUE.

But after this patch, it actually needs two free data blocks, and the
total data_length should be round_up(1K, _SIZE) + round_up(2K,
_SIZE) == 8K (that is, two data blocks). So when there is only one free
data block available in the ring, it should return FALSE, and the
command will wait and retry.
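
The arithmetic in the example above can be sketched as follows. This is an illustration only: MODEL_DATA_BLOCK_SIZE is assumed to be 4 KiB (consistent with 8K being two data blocks), model_round_up() is a stand-in for the kernel's round_up() that assumes a power-of-two block size, and the three space_*() helpers are hypothetical names for the plain sum, the rounded sum (Option A from the earlier reply), and the per-buffer rounding used by this patch.

```c
#include <stddef.h>

/* Assumption: 4 KiB data blocks, matching "8K == two data blocks". */
#define MODEL_DATA_BLOCK_SIZE ((size_t)4096)

/* Round x up to a multiple of y; y must be a power of two. */
static size_t model_round_up(size_t x, size_t y)
{
	return (x + y - 1) & ~(y - 1);
}

/* Original check: the unrounded lengths are simply summed. */
static size_t space_summed(size_t data_len, size_t bidi_len)
{
	return data_len + bidi_len;
}

/* Option A: round the sum, so data and bidi may share a block. */
static size_t space_rounded_sum(size_t data_len, size_t bidi_len)
{
	return model_round_up(data_len + bidi_len, MODEL_DATA_BLOCK_SIZE);
}

/* This patch: round each length, so bidi starts in its own block. */
static size_t space_rounded_each(size_t data_len, size_t bidi_len)
{
	return model_round_up(data_len, MODEL_DATA_BLOCK_SIZE) +
	       model_round_up(bidi_len, MODEL_DATA_BLOCK_SIZE);
}
```

With data_len = 1K and bidi_len = 2K, these return 3K, 4K and 8K respectively, so only the last one asks for the two data blocks that the separated layout needs.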


> Option B is we go with changing the implementation to always use a 
> separate data block for BIDI data (BIDI cmds are rare so no big deal), 
> but then also please look into simplifying code in aasda() and 
> tcmu_queue_cmd_ring that may now be overly complex.
>
Yes, there are still other bugs like this one; I will try to simplify
the code then.

BIDI data still isn't used by tcmu-runner. Is any other consumer
using it?

Thanks,

BRs
Xiubo

> Thanks -- Regards -- Andy
>
>>
>> So, this can be avoided by rounding all the data lengths up
>> to DATA_BLOCK_SIZE.
>>
>> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
>> ---
>>  drivers/target/target_core_user.c | 17 +++++++++++------
>>  1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
>> index 2e33100..59a18fd 100644
>> --- a/drivers/target/target_core_user.c
>> +++ b/drivers/target/target_core_user.c
>> @@ -429,10 +429,11 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>>
>>      mb = udev->mb_addr;
>>      cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
>> -    data_length = se_cmd->data_length;
>> +    data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
>>      if (se_cmd->se_cmd_flags & SCF_BIDI) {
>>          BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
>> -        data_length += se_cmd->t_bidi_data_sg->length;
>> +        data_length += round_up(se_cmd->t_bidi_data_sg->length,
>> +                DATA_BLOCK_SIZE);
>>      }
>>      if ((command_size > (udev->cmdr_size / 2)) ||
>>          data_length > udev->data_size) {
>> @@ -503,10 +504,14 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>>      entry->req.iov_dif_cnt = 0;
>>
>>      /* Handle BIDI commands */
>> -    iov_cnt = 0;
>> -    alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
>> -        se_cmd->t_bidi_data_nents, &iov, &iov_cnt, false);
>> -    entry->req.iov_bidi_cnt = iov_cnt;
>> +    if (se_cmd->se_cmd_flags & SCF_BIDI) {
>> +        iov_cnt = 0;
>> +        iov++;
>> +        alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
>> +                se_cmd->t_bidi_data_nents, &iov, &iov_cnt,
>> +                false);
>> +        entry->req.iov_bidi_cnt = iov_cnt;
>> +    }
>>
>>      /* cmd's data_bitmap is what changed in process */
>>      bitmap_xor(tcmu_cmd->data_bitmap, old_bitmap, udev->data_bitmap,
>>
>



Andy Grover March 6, 2017, 6:09 p.m. UTC | #3
On 03/05/2017 09:48 PM, Xiubo Li wrote:
>> Maybe fix this by exiting aasda() with iov pointing at the next unused
>> iov in the array?
> Maybe it shouldn't be aasda()'s duty to increment the iov ptr.

Sure, your call.

[snip text where we agree current solution allows data block sharing,
and new approach does not, and we're both ok with that]

>> Option B is we go with changing the implementation to always use a
>> separate data block for BIDI data (BIDI cmds are rare so no big deal),
>> but then also please look into simplifying code in aasda() and
>> tcmu_queue_cmd_ring that may now be overly complex.
>>
> Yes, there are still other bugs like this one; I will try to simplify
> the code then.

Great :)

> BIDI data still isn't used by tcmu-runner. Is any other consumer
> using it?

Well with kernel APIs you just never know if somebody's using it but 
just not saying anything. But given this bug, how could they?

Regards -- Andy

Ilias Tsitsimpis March 9, 2017, 12:51 p.m. UTC | #4
Hi Andy, Xiubo,

On Fri, Mar 03, 2017 at 11:00AM, Andy Grover wrote:
> On 02/27/2017 09:47 PM, lixiubo@cmss.chinamobile.com wrote:
> > From: Xiubo Li <lixiubo@cmss.chinamobile.com>
> > 
> > If there is BIDI data, its first iov[] will overwrite the last
> > iov[] for se_cmd->t_data_sg.
> 
> (+CCing orig BIDI and data block code authors)
> 
> Yeah. It looks like because alloc_and_scatter_data_area() (hereafter
> "aasda") is called twice in the BIDI case, and both times iov_cnt is 0, the
> new_iov() call doesn't increment the iov ptr and the first bidi iov
> overwrites the last data iov. Maybe fix this by exiting aasda() with iov
> pointing at the next unused iov in the array? Probably also want to zero the
> iov.

Yes, this is definitely a bug; thanks for catching it. It seems to
have been introduced around the time new_iov() was added, and hence
it has been broken since v4.6.

On Mon, Mar 06, 2017 at 10:09AM, Andy Grover wrote:
> On 03/05/2017 09:48 PM, Xiubo Li wrote:
> > BIDI data still isn't used by tcmu-runner. Is any other consumer
> > using it?
> 
> Well with kernel APIs you just never know if somebody's using it but just
> not saying anything. But given this bug, how could they?

I added support for BIDI commands initially because I needed SCSI OSD
working over tcmu, and the OSD protocol requires BIDI support. I am
definitely using it now. Unfortunately, I haven't been able to
closely follow the development of target/user lately, so I failed to
notice this breakage in time. Sorry about that.
Ilias Tsitsimpis March 9, 2017, 3:54 p.m. UTC | #5
On Thu, Mar 09, 2017 at 10:08PM, 李秀波 wrote:
> Any other advice about this change for your case?

I took a look at the patch you sent and it seems reasonable to me.
I didn't have the time to test it, but I will try to update to the
latest version (probably by next week) and report back.

Thanks for your work :)
Patch

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 2e33100..59a18fd 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -429,10 +429,11 @@  static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 
 	mb = udev->mb_addr;
 	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
-	data_length = se_cmd->data_length;
+	data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
 	if (se_cmd->se_cmd_flags & SCF_BIDI) {
 		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
-		data_length += se_cmd->t_bidi_data_sg->length;
+		data_length += round_up(se_cmd->t_bidi_data_sg->length,
+				DATA_BLOCK_SIZE);
 	}
 	if ((command_size > (udev->cmdr_size / 2)) ||
 	    data_length > udev->data_size) {
@@ -503,10 +504,14 @@  static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 	entry->req.iov_dif_cnt = 0;
 
 	/* Handle BIDI commands */
-	iov_cnt = 0;
-	alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
-		se_cmd->t_bidi_data_nents, &iov, &iov_cnt, false);
-	entry->req.iov_bidi_cnt = iov_cnt;
+	if (se_cmd->se_cmd_flags & SCF_BIDI) {
+		iov_cnt = 0;
+		iov++;
+		alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
+				se_cmd->t_bidi_data_nents, &iov, &iov_cnt,
+				false);
+		entry->req.iov_bidi_cnt = iov_cnt;
+	}
 
 	/* cmd's data_bitmap is what changed in process */
 	bitmap_xor(tcmu_cmd->data_bitmap, old_bitmap, udev->data_bitmap,