diff mbox

4.15.14 crash with iscsi target and dvd

Message ID 20180408160254.GA22869@animx.eu.org (mailing list archive)
State New, archived
Headers show

Commit Message

Wakko Warner April 8, 2018, 4:02 p.m. UTC
Bart Van Assche wrote:
> On Sat, 2018-04-07 at 12:53 -0400, Wakko Warner wrote:
> > Bart Van Assche wrote:
> > > On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > > > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > > > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > > > Logs:
> > > > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> > > 
> > > That's something that should never happen. As one can see in
> > > scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> > > that pointer. Since I have not yet been able to reproduce myself what you
> > > reported, would it be possible for you to bisect this issue? You will need
> > > to follow something like the following procedure (see also
> > > https://git-scm.com/docs/git-bisect):
> > 
> > After doing 3 successful compiles with good/bad, I got this error and was
> > not able to compile any more kernels:
> >   CC      scripts/mod/devicetable-offsets.s
> > scripts/mod/empty.c:1:0: error: code model kernel does not support PIC mode
> >  /* empty file to figure out endianness / word size */
> >  
> > scripts/mod/devicetable-offsets.c:1:0: error: code model kernel does not support PIC mode
> >  #include <linux/kbuild.h>
> >  
> > scripts/Makefile.build:153: recipe for target 'scripts/mod/devicetable-offsets.s' failed
> > 
> > I don't think it found the bad commit.
> 
> Have you tried to modify the kernel Makefile as indicated in the following
> e-mail? This should make the kernel build:
> 
> https://lists.ubuntu.com/archives/kernel-team/2016-May/077178.html

Thanks.  That helped.

I finished with git bisect.  Here's the output:
84c8590646d5b35804bac60eb58b145839b5893e is the first bad commit
commit 84c8590646d5b35804bac60eb58b145839b5893e
Author: Ming Lei <tom.leiming@gmail.com>
Date:   Fri Nov 11 20:05:32 2016 +0800

    target: avoid accessing .bi_vcnt directly
    
    When the bio is full, bio_add_pc_page() will return zero,
    so use this information tell when the bio is full.
    
    Also replace access to .bi_vcnt for pr_debug() with bio_segments().
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ming Lei <tom.leiming@gmail.com>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Jens Axboe <axboe@fb.com>

:040000 040000 a3ebbb71c52ee4eb8c3be4d033b81179211bf704 de39a328dbd1b18519946b3ad46d9302886e0dd0 M      drivers

I did a diff between HEAD^ and HEAD and manually patched the file from
4.15.14.  It's not an exact revert.  I'm running it now and it's working.
I'll do a better test later on.  Here's the patch:


I really appreciate your time and assistance with this.

Comments

Wakko Warner April 8, 2018, 4:15 p.m. UTC | #1
Wakko Warner wrote:
> Bart Van Assche wrote:
> > Have you tried to modify the kernel Makefile as indicated in the following
> > e-mail? This should make the kernel build:
> > 
> > https://lists.ubuntu.com/archives/kernel-team/2016-May/077178.html
> 
> Thanks.  That helped.
> 
> I finished with git bisect.  Here's the output:
> 84c8590646d5b35804bac60eb58b145839b5893e is the first bad commit
> commit 84c8590646d5b35804bac60eb58b145839b5893e
> Author: Ming Lei <tom.leiming@gmail.com>
> Date:   Fri Nov 11 20:05:32 2016 +0800
> 
>     target: avoid accessing .bi_vcnt directly
>     
>     When the bio is full, bio_add_pc_page() will return zero,
>     so use this information tell when the bio is full.
>     
>     Also replace access to .bi_vcnt for pr_debug() with bio_segments().
>     
>     Reviewed-by: Christoph Hellwig <hch@lst.de>
>     Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>     Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>     Signed-off-by: Jens Axboe <axboe@fb.com>
> 
> :040000 040000 a3ebbb71c52ee4eb8c3be4d033b81179211bf704 de39a328dbd1b18519946b3ad46d9302886e0dd0 M      drivers
> 
> I did a diff between HEAD^ and HEAD and manually patched the file from
> 4.15.14.  It's not an exact revert.  I'm running it now and it's working.
> I'll do a better test later on.  Here's the patch:
> 
> --- a/drivers/target/target_core_pscsi.c	2018-02-04 14:31:31.077316617 -0500
> +++ b/drivers/target/target_core_pscsi.c	2018-04-08 11:43:49.588641374 -0400
> @@ -915,7 +915,9 @@
>  					bio, page, bytes, off);
>  			pr_debug("PSCSI: bio->bi_vcnt: %d nr_vecs: %d\n",
>  				bio_segments(bio), nr_vecs);
> -			if (rc != bytes) {
> +			if (rc != bytes)
> +				goto fail;
> +			if (bio->bi_vcnt > nr_vecs) {
>  				pr_debug("PSCSI: Reached bio->bi_vcnt max:"
>  					" %d i: %d bio: %p, allocating another"
>  					" bio\n", bio->bi_vcnt, i, bio);
> 
> I really appreciate your time and assistance with this.

One thing I noticed after doing this is errors in the kernel log on the
initiator:
[9072625.181744] sr 26:0:0:0: [sr1] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[9072625.181802] sr 26:0:0:0: [sr1] tag#0 Sense Key : 0x2 [current] 
[9072625.181835] sr 26:0:0:0: [sr1] tag#0 ASC=0x8 ASCQ=0x0 
[9072625.181866] sr 26:0:0:0: [sr1] tag#0 CDB: opcode=0x28 28 00 00 0a 81 22 00 00 80 00
[9072625.181919] blk_update_request: I/O error, dev sr1, sector 2753672

When doing the exact same thing on the target, no mention.  My patch may not
be right, but it doesn't cause an oops.

I'm going to try 4.16.1 and see what happens.
Bart Van Assche April 9, 2018, 9:30 p.m. UTC | #2
On Sun, 2018-04-08 at 12:02 -0400, Wakko Warner wrote:
> I finished with git bisect.  Here's the output:

> 84c8590646d5b35804bac60eb58b145839b5893e is the first bad commit

> commit 84c8590646d5b35804bac60eb58b145839b5893e

> Author: Ming Lei <tom.leiming@gmail.com>

> Date:   Fri Nov 11 20:05:32 2016 +0800

> 

>     target: avoid accessing .bi_vcnt directly

>     

>     When the bio is full, bio_add_pc_page() will return zero,

>     so use this information tell when the bio is full.

>     

>     Also replace access to .bi_vcnt for pr_debug() with bio_segments().

>     

>     Reviewed-by: Christoph Hellwig <hch@lst.de>

>     Signed-off-by: Ming Lei <tom.leiming@gmail.com>

>     Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

>     Signed-off-by: Jens Axboe <axboe@fb.com>

> 

> :040000 040000 a3ebbb71c52ee4eb8c3be4d033b81179211bf704 de39a328dbd1b18519946b3ad46d9302886e0dd0 M      drivers

> 

> I did a diff between HEAD^ and HEAD and manually patched the file from

> 4.15.14.  It's not an exact revert.  I'm running it now and it's working.

> I'll do a better test later on.  Here's the patch:

> 

> --- a/drivers/target/target_core_pscsi.c	2018-02-04 14:31:31.077316617 -0500

> +++ b/drivers/target/target_core_pscsi.c	2018-04-08 11:43:49.588641374 -0400

> @@ -915,7 +915,9 @@

>  					bio, page, bytes, off);

>  			pr_debug("PSCSI: bio->bi_vcnt: %d nr_vecs: %d\n",

>  				bio_segments(bio), nr_vecs);

> -			if (rc != bytes) {

> +			if (rc != bytes)

> +				goto fail;

> +			if (bio->bi_vcnt > nr_vecs) {

>  				pr_debug("PSCSI: Reached bio->bi_vcnt max:"

>  					" %d i: %d bio: %p, allocating another"

>  					" bio\n", bio->bi_vcnt, i, bio);


Hello Ming,

Can you have a look at this? The start of this e-mail thread is available at
https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg72574.html.

Thanks,

Bart.
diff mbox

Patch

--- a/drivers/target/target_core_pscsi.c	2018-02-04 14:31:31.077316617 -0500
+++ b/drivers/target/target_core_pscsi.c	2018-04-08 11:43:49.588641374 -0400
@@ -915,7 +915,9 @@ 
 					bio, page, bytes, off);
 			pr_debug("PSCSI: bio->bi_vcnt: %d nr_vecs: %d\n",
 				bio_segments(bio), nr_vecs);
-			if (rc != bytes) {
+			if (rc != bytes)
+				goto fail;
+			if (bio->bi_vcnt > nr_vecs) {
 				pr_debug("PSCSI: Reached bio->bi_vcnt max:"
 					" %d i: %d bio: %p, allocating another"
 					" bio\n", bio->bi_vcnt, i, bio);