[for-next,v4,0/5] null_blk: improve write failure simulation

Message ID 20250121081517.1212575-1-shinichiro.kawasaki@wdc.com (mailing list archive)

Message

Shin'ichiro Kawasaki Jan. 21, 2025, 8:15 a.m. UTC
Currently, null_blk has the 'badblocks' parameter to simulate IO failures
for broken blocks. This helps check whether userland tools can handle IO
failures. However, the badblocks feature differs from IO failures on real
storage devices in two ways. Firstly, when a write operation fails on a
bad block, null_blk does not write any data, while real storage devices
sometimes write part of the data. Secondly, null_blk always makes write
operations fail for the specified bad blocks, while real storage devices
can recover bad blocks so that later writes succeed after a failure.
Hence, real storage devices are currently required to check whether
userland tools support such partial writes or bad block recovery.
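
As a rough illustration of the current behavior, here is a toy Python model.
It is not null_blk code: the function name, the dict used as storage, and the
sector-granular bad block set are all assumptions made for this sketch. It
only shows the two properties described above: a failed write stores nothing,
and the same write keeps failing.

```python
def write_all_or_nothing(storage, bad_sectors, start, data):
    """Toy model (hypothetical helper, not kernel code).

    Any overlap with a bad sector fails the whole write, stores no data,
    and leaves the bad sector marked as bad.
    """
    sectors = range(start, start + len(data))
    if any(sector in bad_sectors for sector in sectors):
        return False
    for sector, value in zip(sectors, data):
        storage[sector] = value
    return True


storage = {}
bad_sectors = {5}
# Sectors 3..6 include bad sector 5, so both attempts fail and store nothing.
first = write_all_or_nothing(storage, bad_sectors, 3, "abcd")
second = write_all_or_nothing(storage, bad_sectors, 3, "abcd")
print(first, second, storage)
```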

This series improves the write failure simulation in null_blk so that
userland tools can be checked without real storage devices. The first
patch is a preparation that makes adding new features simpler. The second
patch introduces the 'badblocks_once' parameter to simulate bad block
recovery. The third patch fixes a zone resource management bug, and the
fourth patch adds a function argument to prepare for the fifth patch. The
fifth patch adds partial IO support and introduces the
'badblocks_partial_io' parameter.
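
To make the intended semantics of the two new parameters concrete, here is a
toy Python model. It is a sketch under stated assumptions, not the null_blk
implementation: the class name ToyNullBlk, its fields, and the
sector-granular bookkeeping are hypothetical, and details such as alignment
and zone handling are ignored. It assumes that partial IO stores the sectors
before the first bad one, and that "once" clears the bad sectors that a
failed write hit.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNullBlk:
    """Toy model of bad block handling (hypothetical, not kernel code)."""
    bad: set = field(default_factory=set)
    once: bool = False        # models 'badblocks_once'
    partial_io: bool = False  # models 'badblocks_partial_io'
    storage: dict = field(default_factory=dict)

    def write(self, start, data):
        """Return (ok, number_of_sectors_stored)."""
        sectors = list(range(start, start + len(data)))
        hit = [s for s in sectors if s in self.bad]
        if not hit:
            ok, stored = True, len(sectors)
        else:
            ok = False
            # Assumption: partial IO keeps the data before the first bad sector.
            stored = hit[0] - start if self.partial_io else 0
            if self.once:
                # Assumption: the bad sectors that were hit recover after one failure.
                self.bad.difference_update(hit)
        for sector, value in zip(sectors[:stored], data):
            self.storage[sector] = value
        return ok, stored


dev = ToyNullBlk(bad={5}, once=True, partial_io=True)
first = dev.write(3, "abcd")   # fails, but sectors 3 and 4 are stored
second = dev.write(3, "abcd")  # sector 5 has recovered, so this succeeds
print(first, second)
```

With both flags off, the model falls back to the all-or-nothing, always-fail
behavior of the existing 'badblocks' parameter.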

Changes from v3:
* 4th patch: Renamed null_handle_rq() to null_handle_data_transfer()
* 5th patch: Improved comments of null_handle_badblocks()
* Added Reviewed-by tags

Changes from v2:
* 1st patch: Reflected comments on the list
* 2nd patch: Moved the 4th patch in v2 series to 2nd
             Reduced if-block nest level
* 3rd patch: Added to fix zone resource management bug
* 4th patch: Added to prepare for the next patch
* 5th patch: Rewritten to take care of zone resource management
             Introduced badblocks_partial_io parameter

Changes from v1:
* Added the first patch which avoids the long, multi-line features string

Shin'ichiro Kawasaki (5):
  null_blk: generate null_blk configfs features string
  null_blk: introduce badblocks_once parameter
  null_blk: fix zone resource management for badblocks
  null_blk: pass transfer size to null_handle_rq()
  null_blk: do partial IO for bad blocks

 drivers/block/null_blk/main.c     | 164 +++++++++++++++++++-----------
 drivers/block/null_blk/null_blk.h |   6 ++
 drivers/block/null_blk/zoned.c    |  20 +++-
 3 files changed, 129 insertions(+), 61 deletions(-)

Comments

Chaitanya Kulkarni Jan. 22, 2025, 1:01 a.m. UTC | #1
On 1/21/25 00:15, Shin'ichiro Kawasaki wrote:
> Currently, null_blk has the 'badblocks' parameter to simulate IO failures
> for broken blocks. This helps check whether userland tools can handle IO
> failures. However, the badblocks feature differs from IO failures on real
> storage devices in two ways. Firstly, when a write operation fails on a
> bad block, null_blk does not write any data, while real storage devices
> sometimes write part of the data. Secondly, null_blk always makes write
> operations fail for the specified bad blocks, while real storage devices
> can recover bad blocks so that later writes succeed after a failure.
> Hence, real storage devices are currently required to check whether
> userland tools support such partial writes or bad block recovery.
>
> This series improves the write failure simulation in null_blk so that
> userland tools can be checked without real storage devices. The first
> patch is a preparation that makes adding new features simpler. The second
> patch introduces the 'badblocks_once' parameter to simulate bad block
> recovery. The third patch fixes a zone resource management bug, and the
> fourth patch adds a function argument to prepare for the fifth patch. The
> fifth patch adds partial IO support and introduces the
> 'badblocks_partial_io' parameter.

Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck
Chaitanya Kulkarni Jan. 22, 2025, 1:02 a.m. UTC | #2
Replying to my own email :-

On 1/21/25 17:01, Chaitanya Kulkarni wrote:
> On 1/21/25 00:15, Shin'ichiro Kawasaki wrote:
>> Currently, null_blk has the 'badblocks' parameter to simulate IO failures
>> for broken blocks. This helps check whether userland tools can handle IO
>> failures. However, the badblocks feature differs from IO failures on real
>> storage devices in two ways. Firstly, when a write operation fails on a
>> bad block, null_blk does not write any data, while real storage devices
>> sometimes write part of the data. Secondly, null_blk always makes write
>> operations fail for the specified bad blocks, while real storage devices
>> can recover bad blocks so that later writes succeed after a failure.
>> Hence, real storage devices are currently required to check whether
>> userland tools support such partial writes or bad block recovery.
>>
>> This series improves the write failure simulation in null_blk so that
>> userland tools can be checked without real storage devices. The first
>> patch is a preparation that makes adding new features simpler. The second
>> patch introduces the 'badblocks_once' parameter to simulate bad block
>> recovery. The third patch fixes a zone resource management bug, and the
>> fourth patch adds a function argument to prepare for the fifth patch. The
>> fifth patch adds partial IO support and introduces the
>> 'badblocks_partial_io' parameter.
>

For the whole series :-

> Looks good.
>
> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
>
> -ck
>
>