diff mbox

btt: fix uninitialized err_lock

Message ID x49d13i9uui.fsf@segfault.boston.devel.redhat.com (mailing list archive)
State Accepted
Commit d08cd5e0eb63
Headers show

Commit Message

Jeff Moyer Dec. 13, 2017, 9:33 p.m. UTC
Hi,

When a sector mode namespace is initially created, the arena's err_lock
is not initialized.  If, on the other hand, the namespace already
exists, the mutex is initialized.  To fix the issue, I moved the mutex
initialization into the arena_alloc, which is called by both
discover_arenas and create_arenas.

This was discovered on an older kernel where mutex_trylock checks the
count to determine whether the lock is held.  Because the data structure
is kzalloc-d, that count was 0 (held), and I/O to the device would hang
forever waiting for the lock to be released (see btt_write_pg, for
example).  Current kernels have a different mutex implementation that
checks for a non-null owner, and so this doesn't show up as a problem.
If that lock were ever contended, it might cause issues, but you'd have
to be really unlucky, I think.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

Comments

Verma, Vishal L Dec. 13, 2017, 9:38 p.m. UTC | #1
On Wed, 2017-12-13 at 16:33 -0500, Jeff Moyer wrote:
> Hi,
> 
> When a sector mode namespace is initially created, the arena's
> err_lock
> is not initialized.  If, on the other hand, the namespace already
> exists, the mutex is initialized.  To fix the issue, I moved the
> mutex
> initialization into the arena_alloc, which is called by both
> discover_arenas and create_arenas.
> 
> This was discovered on an older kernel where mutex_trylock checks the
> count to determine whether the lock is held.  Because the data
> structure
> is kzalloc-d, that count was 0 (held), and I/O to the device would
> hang
> forever waiting for the lock to be released (see btt_write_pg, for
> example).  Current kernels have a different mutex implementation that
> checks for a non-null owner, and so this doesn't show up as a
> problem.
> If that lock were ever contended, it might cause issues, but you'd
> have
> to be really unlucky, I think.
> 
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

Ah, good find - the fix looks good to me.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>

> 
> diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
> index e949e33..5860f99 100644
> --- a/drivers/nvdimm/btt.c
> +++ b/drivers/nvdimm/btt.c
> @@ -630,6 +630,7 @@ static struct arena_info *alloc_arena(struct btt
> *btt, size_t size,
>  		return NULL;
>  	arena->nd_btt = btt->nd_btt;
>  	arena->sector_size = btt->sector_size;
> +	mutex_init(&arena->err_lock);
>  
>  	if (!size)
>  		return arena;
> @@ -758,7 +759,6 @@ static int discover_arenas(struct btt *btt)
>  		arena->external_lba_start = cur_nlba;
>  		parse_arena_meta(arena, super, cur_off);
>  
> -		mutex_init(&arena->err_lock);
>  		ret = btt_freelist_init(arena);
>  		if (ret)
>  			goto out;
Jeff Moyer Jan. 15, 2018, 6:16 p.m. UTC | #2
Dan,

Any chance you can pull this one in?  (I know you've been busy.)

-Jeff

"Verma, Vishal L" <vishal.l.verma@intel.com> writes:

> On Wed, 2017-12-13 at 16:33 -0500, Jeff Moyer wrote:
>> Hi,
>> 
>> When a sector mode namespace is initially created, the arena's
>> err_lock
>> is not initialized.  If, on the other hand, the namespace already
>> exists, the mutex is initialized.  To fix the issue, I moved the
>> mutex
>> initialization into the arena_alloc, which is called by both
>> discover_arenas and create_arenas.
>> 
>> This was discovered on an older kernel where mutex_trylock checks the
>> count to determine whether the lock is held.  Because the data
>> structure
>> is kzalloc-d, that count was 0 (held), and I/O to the device would
>> hang
>> forever waiting for the lock to be released (see btt_write_pg, for
>> example).  Current kernels have a different mutex implementation that
>> checks for a non-null owner, and so this doesn't show up as a
>> problem.
>> If that lock were ever contended, it might cause issues, but you'd
>> have
>> to be really unlucky, I think.
>> 
>> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
>
> Ah, good find - the fix looks good to me.
> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
>
>> 
>> diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
>> index e949e33..5860f99 100644
>> --- a/drivers/nvdimm/btt.c
>> +++ b/drivers/nvdimm/btt.c
>> @@ -630,6 +630,7 @@ static struct arena_info *alloc_arena(struct btt
>> *btt, size_t size,
>>  		return NULL;
>>  	arena->nd_btt = btt->nd_btt;
>>  	arena->sector_size = btt->sector_size;
>> +	mutex_init(&arena->err_lock);
>>  
>>  	if (!size)
>>  		return arena;
>> @@ -758,7 +759,6 @@ static int discover_arenas(struct btt *btt)
>>  		arena->external_lba_start = cur_nlba;
>>  		parse_arena_meta(arena, super, cur_off);
>>  
>> -		mutex_init(&arena->err_lock);
>>  		ret = btt_freelist_init(arena);
>>  		if (ret)
>>  			goto out;
diff mbox

Patch

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index e949e33..5860f99 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -630,6 +630,7 @@  static struct arena_info *alloc_arena(struct btt *btt, size_t size,
 		return NULL;
 	arena->nd_btt = btt->nd_btt;
 	arena->sector_size = btt->sector_size;
+	mutex_init(&arena->err_lock);
 
 	if (!size)
 		return arena;
@@ -758,7 +759,6 @@  static int discover_arenas(struct btt *btt)
 		arena->external_lba_start = cur_nlba;
 		parse_arena_meta(arena, super, cur_off);
 
-		mutex_init(&arena->err_lock);
 		ret = btt_freelist_init(arena);
 		if (ret)
 			goto out;